Abstract

We report complexity results about redundancy of formulae in 2CNF form. We 
first consider the problem of checking redundancy and show some algorithms that 
are slightly better than the trivial one. We then analyze problems related to finding 
irredundant equivalent subsets (I.E.S.'s) of a given set. The concept of cyclicity proved
to be relevant to the complexity of these problems. Some results about Horn formulae 
are also shown. 



1 Introduction 

The complexity of some problems related to the redundancy of propositional CNF formulae
has been studied in a previous paper [Lib05]. The motivations for studying redundancy
can be summarized as follows: first, removing redundancy from a knowledge base makes it
simpler without changing its structure much; second, the presence of redundant parts in a
knowledge base can be a sign of the importance of the represented concept, but can also be a
sign of mistakes in the formulation of the knowledge base [Lib05]. Related to the problem
of redundancy of CNF formulae are the redundancy of production rules [Gin88, SS97], the
minimization of CNF and Horn formulae [MS72, Mai80, ADS86, HK93, HW97, Uma98], the
redundancy of literals in a clause [GF93], the redundancy for non-monotonic logics [Lib], the
equivalence and extension-equivalence of irredundant formulae [BZ05], and the problem of
minimal unsatisfiability [PW88, FKS02, Bru03], which is the special case of irredundancy of
inconsistent formulae. A comparison between the problem of redundancy and related work
is given in the paper where redundancy of general CNF formulae is studied [Lib05].

In this paper, we study the complexity of problems related to the redundancy of formulae 
in 2CNF and Horn form. Most of the results are about the 2CNF form, as the corresponding 
problems for the Horn case are either trivial or have proofs of complexity that coincide with 
the corresponding ones for the 2CNF form. 

The first problem we consider is that of checking the redundancy of a 2CNF formula.
Since a formula is redundant if and only if it is equivalent to one of its proper subsets and
checking equivalence for 2CNF formulae is polynomial, the problem is polynomial. We slightly
improve over the trivial algorithm by showing that redundancy can be checked in time $O(nm)$,
where $n$ is the number of variables and $m$ is the number of clauses of the formula.

The other problems we consider are about the irredundant equivalent subsets (I.E.S.'s)
of a formula. In particular, the following problems are easily shown to be polynomial for
formulae in 2CNF: check whether a formula is an I.E.S. of another one; check whether a
clause is in all I.E.S.'s of a formula; and check whether a formula has a unique I.E.S. The
last two problems are polynomial thanks to the following results [Lib05]: a clause $\gamma$ is in all
I.E.S.'s of a formula $\Pi$ if and only if $\Pi \backslash \{\gamma\} \not\models \gamma$; a formula $\Pi$ has a unique I.E.S. if and
only if $\{\gamma \in \Pi \mid \Pi \backslash \{\gamma\} \not\models \gamma\} \models \Pi$. Combined with the fact that inference is polynomial for
2CNF clauses, these two results imply that checking the presence of a clause in all I.E.S.'s and
the uniqueness of I.E.S.'s are polynomial problems.

Two problems about I.E.S.'s require a more complicated complexity analysis: checking
whether a clause is in at least one I.E.S. of a formula and checking whether a formula has an
I.E.S. of size bounded by an integer $k$. The complexity of these two problems largely depends
on the presence of cycles of clauses in the formula. Namely, if the formula contains a cycle
of clauses, defined as a sequence of clauses $[\neg l_1 \vee l_2, \neg l_2 \vee l_3, \ldots, \neg l_n \vee l_1]$, these two problems
are typically NP-complete, while they are polynomial if the formula does not contain cycles.

The complexity analysis for the 2CNF form has been carried out separately for the cases
in which the set of clauses is:

1. inconsistent; 

2. consistent and implying some literals; 

3. consistent and not implying any literal. 

We prove that the clauses not containing implied literals and the clauses containing
implied literals can be considered separately. More precisely, the problem of redundancy can
be solved in two steps:

1. check the redundancy in $\Pi$ of clauses $l \vee l'$ such that $\Pi \models l$;

2. remove from $\Pi$ all clauses $l \vee l'$ such that $\Pi \models l$, and check redundancy.

A similar procedure can be used for problems about I.E.S.'s: we indeed prove that every
I.E.S. of a consistent formula is composed of two parts, the second being an I.E.S. of the
formula composed of the clauses of the formula not containing an implied literal. An I.E.S.
of a formula can therefore be found by first finding an I.E.S. of this reduced formula and then
checking which clauses have to be added to allow the derivation of literals that are implied
by the original formula.

The three conditions of inconsistent formulae, formulae implying literals, and formulae
not implying literals each require a different analysis. Surprisingly, however, the complexity of
the problems is usually the same in the three cases. Namely, the complexity of checking
redundancy is always $O(nm)$ regardless of these conditions, while the complexity of the
problems of presence in an I.E.S. and of the uniqueness of I.E.S.'s depends more on the
presence of cycles in the formula than on the consistency or presence of implied literals.

2 Preliminaries



In this paper, we study CNF formulae that are either in 2CNF or in Horn form. Given
a set of propositional variables, a literal is a variable or a negated variable. A clause is a
disjunction of literals; in particular, a unary/binary clause is a clause composed of one or two
literals. A Horn clause is a clause containing at most one positive literal. A set of clauses
containing only unary and binary clauses is a 2CNF formula. A set of clauses composed only
of Horn clauses is a Horn formula.

Redundancy of clauses and formulae is defined as follows.

Definition 1 A clause $\gamma$ is redundant in a CNF formula $\Pi$ if $\Pi \backslash \{\gamma\} \models \gamma$.

This definition allows a clause $\gamma$ not in $\Pi$ to be classified as redundant in $\Pi$. However,
we are typically interested in the redundancy of clauses $\gamma \in \Pi$. Obviously, if $\gamma \in \Pi$ is
irredundant in $\Pi$, it is also irredundant in every $\Pi' \subseteq \Pi$ that contains it.

Definition 2 A CNF formula is redundant if it contains a redundant clause. 
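
Definitions 1 and 2 can be tried out on small formulae with a brute-force sketch such as the following. It is only an illustration and not one of the algorithms studied in this paper: the encoding (literals as nonzero integers, with -v standing for the negation of variable v, and clauses as frozensets) and all function names are assumptions made for the example, and entailment is tested by truth tables rather than by the polynomial methods available for 2CNF.

```python
from itertools import product

def satisfies(assignment, clause):
    # A clause is satisfied when at least one of its literals is true.
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)

def entails(formula, clause):
    # Truth-table test of "formula |= clause" (exponential; illustration only).
    variables = sorted({abs(lit) for c in list(formula) + [clause] for lit in c})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(satisfies(assignment, c) for c in formula) and not satisfies(assignment, clause):
            return False
    return True

def is_redundant_clause(formula, clause):
    # Definition 1: clause is redundant in formula if formula \ {clause} |= clause.
    return entails(formula - {clause}, clause)

def is_redundant(formula):
    # Definition 2: a formula is redundant if it contains a redundant clause.
    return any(is_redundant_clause(formula, c) for c in formula)

# Example with a=1, b=2, c=3: {not a or b, not b or c, not a or c}.
pi = {frozenset({-1, 2}), frozenset({-2, 3}), frozenset({-1, 3})}
```

In this example the clause `frozenset({-1, 3})` is redundant because it follows from the other two, while the formula obtained by removing it is irredundant.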

In this paper we study the problem of checking the redundancy of 2CNF and Horn formulae,
and some problems related to making a formula irredundant by eliminating redundant
clauses. What results from this process is formalized by the following definition.

Definition 3 ([Lib05]) An Irredundant Equivalent Subset (I.E.S.) of a CNF formula $\Pi$ is
a formula $\Pi'$ such that $\Pi' \subseteq \Pi$, $\Pi' \equiv \Pi$, and $\Pi'$ is irredundant.

Every formula has at least one I.E.S. An irredundant formula has a single I.E.S., which
is the formula itself. A redundant formula can have a number of I.E.S.'s ranging from one to
exponentially many [Lib05]. The following properties have been proved in a previous paper.

Property 1 ([Lib05]) A clause $\gamma$ is in all I.E.S.'s of a formula $\Pi$ if and only if $\gamma$ is
irredundant in $\Pi$.

Property 2 ([Lib05]) A formula $\Pi$ has a single I.E.S. if and only if
$\{\gamma \in \Pi \mid \Pi \backslash \{\gamma\} \not\models \gamma\} \models \Pi$.
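
The two properties can be compared against an exhaustive enumeration of I.E.S.'s on small inputs. The sketch below is an illustration under assumptions that are not part of the paper: literals are nonzero integers (-v negates v), clauses are frozensets, entailment is decided by truth tables, and the function names are hypothetical.

```python
from itertools import combinations, product

def satisfies(assignment, clause):
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)

def entails(formula, clause):
    # Truth-table entailment test (exponential; illustration only).
    variables = sorted({abs(lit) for c in list(formula) + [clause] for lit in c})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(satisfies(assignment, c) for c in formula) and not satisfies(assignment, clause):
            return False
    return True

def irredundant_clauses(formula):
    # Clauses that are not entailed by the rest of the formula.
    return {c for c in formula if not entails(formula - {c}, c)}

def all_ies(formula):
    # Enumerate the subsets that are equivalent to the formula and irredundant.
    # A subset is always entailed by the formula, so only the converse is tested.
    result = []
    clauses = list(formula)
    for size in range(len(clauses) + 1):
        for subset in combinations(clauses, size):
            candidate = set(subset)
            if all(entails(candidate, c) for c in formula) and \
                    irredundant_clauses(candidate) == candidate:
                result.append(frozenset(candidate))
    return result

def in_all_ies(formula, clause):
    # Property 1: membership in all I.E.S.'s is the same as irredundancy.
    return clause in irredundant_clauses(formula)

def has_single_ies(formula):
    # Property 2: the irredundant clauses alone entail the whole formula.
    core = irredundant_clauses(formula)
    return all(entails(core, c) for c in formula)

# a=1, b=2, c=3. "chain" has one I.E.S.; "cycle" (a, b, c all equivalent) has several.
chain = {frozenset({-1, 2}), frozenset({-2, 3}), frozenset({-1, 3})}
cycle = {frozenset({-1, 2}), frozenset({-2, 1}), frozenset({-2, 3}),
         frozenset({-3, 2}), frozenset({-1, 3}), frozenset({-3, 1})}
```

On `chain` the enumeration and Property 2 agree that the I.E.S. is unique; on `cycle` every clause is redundant, so no clause belongs to all I.E.S.'s and the I.E.S. is not unique.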

We use the following notation for the literals that are entailed by a formula.
Notation: $\Pi{\models} = \{l \mid \Pi \models l\}$

The following notation for the clauses containing a literal in a set will be used.
Notation: $\Pi|\{l_1, \ldots, l_m\} = \{\gamma \mid l_i \in \gamma,\ \gamma \in \Pi\}$

We also use $\Pi|l = \Pi|\{l\}$, where $l$ is a literal.
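
As an illustration of this notation, the following sketch computes the set of entailed literals and the restriction of a formula to a set of literals. The integer encoding of literals (-v negates v), the frozenset clauses, and the names `entailed_literals` and `restrict` are assumptions of the example; entailed literals are found by enumerating models, which is only practical for small formulae.

```python
from itertools import product

def models(formula):
    # Yield every satisfying assignment as a dict from variable to truth value.
    variables = sorted({abs(lit) for c in formula for lit in c})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(any(assignment[abs(l)] == (l > 0) for l in c) for c in formula):
            yield assignment

def entailed_literals(formula):
    # Literals over the variables of the formula that hold in every model.
    found = list(models(formula))
    if not found:
        raise ValueError("an inconsistent formula entails every literal")
    result = set()
    for variable in found[0]:
        values = {m[variable] for m in found}
        if values == {True}:
            result.add(variable)
        elif values == {False}:
            result.add(-variable)
    return result

def restrict(formula, literals):
    # Clauses of the formula containing at least one of the given literals.
    return {c for c in formula if any(l in c for l in literals)}

# a=1, b=2, c=3, d=4: {a, not a or b, not c or d}.
pi = {frozenset({1}), frozenset({-1, 2}), frozenset({-3, 4})}
```

For this formula the entailed literals are a and b, and the restriction to them leaves out only the clause over c and d.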

2.1 Unit Propagation

The proofs of complexity of redundancy of 2CNF formulae are mostly done using unit
propagation. The following lemmas show how entailment is related to unit propagation. We
denote by $\vdash_R$ the derivation by resolution, and by $\vdash_{UP}$ the derivation by unit propagation.

In what follows, $\Pi$ denotes a 2CNF formula, i.e., a set of clauses, each composed of at most
two literals. We assume that $\Pi$ does not contain the empty clause. By resolution trees we
mean regular resolution trees; their root is labeled with a clause which is not necessarily $\bot$.
A well-known property of resolution is that of being a complete inference method for prime
implicates:

Property 3 For any set of clauses $\Pi$ and clause $\gamma$, it holds $\Pi \models \gamma$ if and only if there exists
$\gamma' \subseteq \gamma$ such that $\Pi \vdash_R \gamma'$.

When applied to binary clauses, this property can be reformulated as:
$\Pi \models l_1 \vee l_2$ if and only if one of the following conditions holds:

$\Pi \vdash_R \bot$
$\Pi \vdash_R l_1$
$\Pi \vdash_R l_2$
$\Pi \vdash_R l_1 \vee l_2$

We show how resolution is related to unit propagation for 2CNF formulae.

Lemma 1 For any 2CNF formula $\Pi$ and two literals $l_1$ and $l_2$, if $\Pi \vdash_R l_1 \vee l_2$, then
$\Pi \cup \{\neg l_1\} \vdash_{UP} l_2$.

Proof. Since $\Pi \vdash_R l_1 \vee l_2$ there is a resolution tree rooted with $l_1 \vee l_2$. We prove the lemma
by induction on the height of the tree. The base case of the induction is when the tree is a leaf.
In this case, $l_1 \vee l_2 \in \Pi$, which implies that $\Pi \cup \{\neg l_1\} \vdash_{UP} l_2$.

Let us therefore assume that the claim holds for any binary clause that can be proved
with a tree of height $k$, and prove it for clauses requiring trees of height $k + 1$. Let us therefore
consider a tree of height $k + 1$ and labeled with $l_1 \vee l_2$ in the root. Its subtrees have height
less than or equal to $k$, and their roots are marked with $l_1 \vee l_3$ and $\neg l_3 \vee l_2$ for some literal
$l_3$; note that resolution does not allow deriving $l_1 \vee l_2$ from $l_1$ or from $l_2$.




[Figure: resolution tree with root $l_1 \vee l_2$ and children $l_1 \vee l_3$ and $\neg l_3 \vee l_2$.]

Since the resolution trees of $l_1 \vee l_3$ and $\neg l_3 \vee l_2$ both have height less than or equal to $k$,
by the induction hypothesis both $\Pi \cup \{\neg l_1\} \vdash_{UP} l_3$ and $\Pi \cup \{l_3\} \vdash_{UP} l_2$ hold. As a result,
$\Pi \cup \{\neg l_1\} \vdash_{UP} l_2$. □

The following lemma shows that inference of a single unit clause can be checked using
unit propagation.

Lemma 2 For any 2CNF formula $\Pi$ and literal $l$, if $\Pi \vdash_R l$ then $\Pi \cup \{\neg l\} \vdash_{UP} \bot$.

Proof. The claim is proved by induction on the height of the resolution tree rooted with $l$.
In the base case, this tree is a leaf, and therefore $l \in \Pi$. The claim $\Pi \cup \{\neg l\} \vdash_{UP} \bot$ therefore
holds.

We now assume that the claim is true for any literal that is derivable from $\Pi$ using a
resolution tree of height less than or equal to $k$ and prove that the same holds for height
$k + 1$. Let $l$ be the root of a resolution tree of height $k + 1$. The children of the root
can be either both binary, or a unary clause resolved with a binary clause. Let
us consider this latter case first.



[Figure: resolution tree with root $l$ and children $l_1$ and $\neg l_1 \vee l$.]




By induction, $\Pi \cup \{\neg l_1\} \vdash_{UP} \bot$, as the subtree rooted with $l_1$ has height less than or
equal to $k$. By Lemma 1, $\neg l$ implies $\neg l_1$ by unit propagation. As a result, $\Pi \cup \{\neg l\} \vdash_{UP} \bot$.

Let us now consider the situation in which the children of $l$ are both binary clauses. Let
us call $l_1$ the literal they are resolved upon, that is, the two clauses are $l \vee l_1$ and $l \vee \neg l_1$.




By Lemma 1, $\Pi \cup \{\neg l\} \vdash_{UP} l_1$ and $\Pi \cup \{\neg l\} \vdash_{UP} \neg l_1$. Since $\neg l$ allows deriving a pair of
contradictory literals by unit propagation, we have $\Pi \cup \{\neg l\} \vdash_{UP} \bot$. □

We now use Property 3 to show how derivation is related to unit propagation. 

Lemma 3 For every 2CNF formula $\Pi$ and literals $l_1$ and $l_2$, $\Pi \models l_1 \vee l_2$ if and only if $\Pi$ is
inconsistent or $\Pi \cup \{\neg l_1\} \vdash_{UP} \bot$ or $\Pi \cup \{\neg l_2\} \vdash_{UP} \bot$ or $\Pi \cup \{\neg l_1\} \vdash_{UP} l_2$.

Proof. The "if" direction is due to the fact that unit propagation is a sound (but not complete)
entailment method, that is, if $\Pi \vdash_{UP} \gamma$ then $\Pi \models \gamma$.

The other direction is proved by applying Property 3. Indeed, $\Pi \models \gamma$ with $\gamma = l_1 \vee l_2$ implies
that either $\Pi$ is inconsistent, or it implies by resolution $l_1$, or $l_2$, or $l_1 \vee l_2$. In turn, $\Pi \vdash_R l_1$ implies
$\Pi \cup \{\neg l_1\} \vdash_{UP} \bot$, and similarly for $\Pi \vdash_R l_2$, thanks to Lemma 2. Moreover, $\Pi \vdash_R l_1 \vee l_2$
implies $\Pi \cup \{\neg l_1\} \vdash_{UP} l_2$ thanks to Lemma 1. □

This lemma proves that unit propagation can be used to check whether a clause of two 
literals is implied by a consistent 2CNF formula. The following lemma is about inconsistent 
formulae. 

Lemma 4 A 2CNF formula $\Pi$ is inconsistent if and only if there exists a variable $x$ such
that $\Pi \cup \{x\} \vdash_{UP} \bot$ and $\Pi \cup \{\neg x\} \vdash_{UP} \bot$.

Proof. The "if" direction is obvious, thanks to the soundness of unit propagation.

For the other direction, let us consider a minimal regular resolution tree for the inconsistent
formula $\Pi$. Its root is marked with $\bot$, so its children have to be marked $x$ and $\neg x$ for some
variable $x$. Therefore, we have that $\Pi \vdash_R x$ and $\Pi \vdash_R \neg x$. By Lemma 2, the claim is proved. □
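
Lemmas 3 and 4 suggest a direct procedure for testing entailment of a binary clause with unit propagation only. The following is a minimal sketch under assumptions that are not part of the paper: literals are nonzero integers (-v negates v), clauses are frozensets of one or two literals, and the function names are illustrative. No attempt is made to reach the running times discussed in this paper.

```python
def unit_propagate(formula, start):
    # Returns (derived, conflict): the literals derived by unit propagation from
    # the formula plus the unit clause `start`, and whether a clause is falsified.
    derived = {start}
    changed = True
    while changed:
        changed = False
        for clause in formula:
            if any(lit in derived for lit in clause):
                continue                      # clause already satisfied
            remaining = [lit for lit in clause if -lit not in derived]
            if not remaining:
                return derived, True          # every literal of the clause is false
            if len(remaining) == 1:
                derived.add(remaining[0])     # the clause has become unit
                changed = True
    return derived, False

def inconsistent(formula):
    # Lemma 4: some variable leads to contradiction under both of its values.
    variables = {abs(lit) for clause in formula for lit in clause}
    return any(unit_propagate(formula, v)[1] and unit_propagate(formula, -v)[1]
               for v in variables)

def entails_binary(formula, l1, l2):
    # Lemma 3: the four alternative conditions for "formula |= l1 or l2".
    if inconsistent(formula):
        return True
    derived, conflict = unit_propagate(formula, -l1)
    if conflict or l2 in derived:
        return True
    return unit_propagate(formula, -l2)[1]

# a=1, b=2, c=3.
chain = {frozenset({-1, 2}), frozenset({-2, 3})}
contradiction = {frozenset({1, 2}), frozenset({1, -2}),
                 frozenset({-1, 2}), frozenset({-1, -2})}
with_unit = {frozenset({1}), frozenset({-1, 2})}
```

For instance, `entails_binary(chain, -1, 3)` holds because propagating a reaches c, while `entails_binary(chain, -3, 1)` does not hold because none of the four conditions of Lemma 3 is met.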



2.2 Formulae Implying Literals 

A consistent 2CNF formula $\Pi$ can imply some literals or not. We show that, as far as
redundancy and I.E.S.'s are concerned, we can treat the clauses containing an implied
literal and those not containing them separately. We first formally prove that we can replace
all clauses containing an entailed literal with the literal itself. This result holds for all
formulae in CNF, and is fairly obvious.

Lemma 5 If $\Pi$ is a CNF formula such that $\Pi \models l$, then $\Pi$ and $\Pi \backslash (\Pi|l) \cup \{l\}$ are equivalent.

Proof. Let $M$ be a model of $\Pi$. By definition, $M$ satisfies all clauses of $\Pi$. Since $\Pi \models l$, this
model assigns true to $l$. Since $\Pi \backslash (\Pi|l) \cup \{l\}$ only contains clauses of $\Pi$ and $l$, it is satisfied
by $M$.

Vice versa, let $M$ be a model of $\Pi \backslash (\Pi|l) \cup \{l\}$. This model assigns true to $l$. Moreover,
it satisfies all clauses of $\Pi$ not containing $l$. Since it also satisfies all clauses of $\Pi$ containing
$l$ because it sets $l$ to true, it satisfies all clauses of $\Pi$. □

Replacing all clauses containing an entailed literal with the literal itself does not only 
preserve equivalence but also redundancy, for 2CNF formulae. 

Lemma 6 Let $\Pi$ be a formula implying the literal $l$. If the clause $\gamma \in \Pi$ does not contain $l$,
then $\gamma$ is redundant in $\Pi$ if and only if it is redundant in $\Pi \backslash (\Pi|l) \cup \{l\}$.

Proof. Since $\gamma$ does not contain $l$, it is both in $\Pi$ and in $\Pi \backslash (\Pi|l) \cup \{l\}$. We therefore only
have to prove that $\gamma$ is entailed by $\Pi \backslash \{\gamma\}$ if and only if it is entailed by $(\Pi \backslash (\Pi|l) \cup \{l\}) \backslash \{\gamma\}$.
We prove that these two formulae are equivalent. Since $\gamma$ is not a clause of $\Pi|l$, we have
that $(\Pi \backslash (\Pi|l) \cup \{l\}) \backslash \{\gamma\}$ is the same as $(\Pi \backslash \{\gamma\}) \backslash (\Pi|l) \cup \{l\}$. By Lemma 5, this formula is
equivalent to $\Pi \backslash \{\gamma\}$. □

This lemma shows that clauses that do not contain literals that are entailed by the
formula can be checked for redundancy after the clauses containing entailed literals have
been removed. The following corollary is an obvious consequence of this lemma.

Corollary 1 If $\Pi$ is a consistent 2CNF such that $\Pi \models l$, then $\Pi$ contains a redundant clause
not containing $l$ if and only if $\Pi \backslash (\Pi|l) \cup \{l\}$ is redundant.

This lemma shows that, once we have determined that all clauses containing a literal
$l$ such that $\Pi \models l$ are irredundant, we can replace all such clauses with $l$. Repeating this
procedure for all literals implied by the formula, we obtain a formula with disjoint unit and
binary clauses. This is because, if $\Pi \models l$, then every clause $l \vee l'$ is replaced by $l$, while every
clause $\neg l \vee l'$ is replaced by $l'$. As a result, no binary clause contains a variable that is in a
unit clause in the resulting formula.

We now show a similar result about I.E.S.'s, proving that clauses containing literals
entailed by the formula can be replaced by these literals.

Lemma 7 Let $\Pi$ be a consistent 2CNF such that $\Pi \models l$. If $\Pi'$ is an I.E.S. of $\Pi$ then
$\Pi_2 = \Pi' \backslash (\Pi|l) \cup \{l\}$ is an I.E.S. of $\Pi \backslash (\Pi|l) \cup \{l\}$.

Proof. Since $\Pi'$ is an I.E.S. of $\Pi$, we have that $\Pi' \subseteq \Pi$. As a result, $\Pi' \backslash (\Pi|l) \cup \{l\} \subseteq
\Pi \backslash (\Pi|l) \cup \{l\}$. Containment is the first condition for a formula being an I.E.S. of another
formula. We now show equivalence and irredundancy.

Since $\Pi'$ is equivalent to $\Pi$, we have $\Pi' \models l$. By Lemma 5, $\Pi'$ is equivalent to $\Pi' \backslash (\Pi|l) \cup \{l\}$,
which is indeed $\Pi_2$. As a result, $\Pi_2$ is equivalent to $\Pi'$, which is equivalent to $\Pi$, which is
equivalent to $\Pi \backslash (\Pi|l) \cup \{l\}$ by Lemma 5.

Let us now assume that $\Pi' \backslash (\Pi|l) \cup \{l\}$ is redundant. The clause $l$ cannot be redundant
because it is the only clause mentioning the literal $l$. Therefore, there exists a clause $\gamma$ not
containing $l$ such that $\gamma$ is redundant in $\Pi' \backslash (\Pi|l) \cup \{l\}$. By Lemma 6, $\gamma$ is also redundant in
$\Pi'$, thus contradicting the assumption that $\Pi'$ is an I.E.S. □

The converse of this lemma does not hold. Even if $\Pi_2$ is an I.E.S. of $\Pi \backslash (\Pi|l) \cup \{l\}$, it is
not necessarily true that an I.E.S. of $\Pi$ can be obtained by simply adding some clauses of $\Pi|l$
to it. Actually, even adding all clauses of $\Pi|l$ does not necessarily lead to a formula
that is equivalent to $\Pi$, as the following example shows.

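$\Pi = \{x \vee x_1,\ x \vee x_2,\ \neg x_1 \vee y,\ \neg x_2 \vee \neg y,\ \neg x \vee y\}$
$l = x$

$\Pi \backslash (\Pi|l) \cup \{l\} = \{\neg x_1 \vee y,\ \neg x_2 \vee \neg y,\ \neg x \vee y,\ x\}$

$\Pi_2 = \{\neg x_2 \vee \neg y,\ \neg x \vee y,\ x\}$

$\Pi_2 \backslash \{l\} \cup (\Pi|l) = \{\neg x_2 \vee \neg y,\ \neg x \vee y,\ x \vee x_1,\ x \vee x_2\}$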

It holds $\Pi \models x$: this can be proved by adding $\neg x$ to $\Pi$ and using unit propagation: $x_1$
and $x_2$ are derived, leading to $y$ and $\neg y$, respectively. It is also easy to prove that $\Pi_2$ is
equivalent to $\Pi \backslash (\Pi|l) \cup \{l\}$: since $\neg x \vee y$ and $x$ entail $y$, the clause $\neg x_1 \vee y$ is entailed by
$\Pi_2$. The irredundancy of $\Pi_2$ is also easy to prove: removing a single clause from it, either
$x$, $y$, or $\neg x_2$ cannot be derived any longer. Adding all clauses of $\Pi|l$ to $\Pi_2 \backslash \{l\}$, however, does
not allow deriving $x$ any longer. Indeed, $\Pi_2 \backslash \{l\} \cup (\Pi|l)$ has the model $\{\neg x, x_1, x_2, \neg y\}$, which
assigns false to $x$.
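
The claims made about this counterexample can be checked mechanically. The following sketch does so by truth tables; the integer encoding of the variables and the helper names are assumptions made for the illustration only.

```python
from itertools import product

X, X1, X2, Y = 1, 2, 3, 4          # hypothetical numbering of x, x_1, x_2, y

def cl(*literals):
    # Build a clause; a negative integer denotes a negated variable.
    return frozenset(literals)

def satisfies(assignment, clause):
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)

def entails(formula, clause):
    # Truth-table entailment test (exponential; illustration only).
    variables = sorted({abs(lit) for c in list(formula) + [clause] for lit in c})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(satisfies(assignment, c) for c in formula) and not satisfies(assignment, clause):
            return False
    return True

def equivalent(first, second):
    return all(entails(first, c) for c in second) and all(entails(second, c) for c in first)

pi = {cl(X, X1), cl(X, X2), cl(-X1, Y), cl(-X2, -Y), cl(-X, Y)}
unit_x = cl(X)
pi_on_x = {c for c in pi if X in c}             # the clauses of pi containing x
replaced = (pi - pi_on_x) | {unit_x}            # clauses containing x replaced by x
pi2 = {cl(-X2, -Y), cl(-X, Y), unit_x}          # the I.E.S. of the example
back = (pi2 - {unit_x}) | pi_on_x               # x removed, clauses containing x added
```

With these definitions, `entails(pi, unit_x)` and `equivalent(pi2, replaced)` hold and no clause of `pi2` is entailed by the others, while `entails(back, unit_x)` fails, as stated above.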

An analysis of this counterexample shows why the converse of Lemma 7 is false. The
problem is with the clause $\neg x \vee y$, which makes $y$ true in the original formula. This clause
allows removing $\neg x_1 \vee y$, which was necessary in $\Pi$ to entail $x$. Without the clause $\neg x \vee y$,
indeed, the converse of Lemma 7 would hold for $\Pi$ and $l = x$.

More precisely, a possible (but incorrect) proof of the converse of Lemma 7 would go by
considering that $x_1$ and $x_2$ should always entail $y$ and $\neg y$, respectively, in every I.E.S. of $\Pi$,
and therefore in any I.E.S. of $\Pi \backslash (\Pi|l) \cup \{l\}$, because this formula is equivalent to $\Pi$. What
makes this proof fail on the counterexample above is that $x_1 \rightarrow y$ holds also because of $x$
and $\neg x \vee y$, and this proof relies on $x$, which is removed while "coming back" from $\Pi_2$ to
$\Pi_2 \backslash \{l\} \cup (\Pi|l)$.

As a result, the problem with this proof is that the clause $\neg x_1 \vee y$, which is part of the
proof of $x$, is not necessary because it can be derived from $l = x$ in $\Pi \backslash (\Pi|l) \cup \{l\}$. On the
other hand, $l$ can only derive clauses of the form $l \vee \mathit{something}$, or $y \vee \mathit{something}$ where $y$
is a consequence of $l$. In other words, what makes the proof fail is the possible entailment
of clauses containing literals that are derivable from $l$. On the other hand, $\Pi \models l$; therefore,
$\Pi \models y$. The counterexample therefore relies on the presence in $\Pi \backslash (\Pi|l) \cup \{l\}$ of clauses
containing literals that are entailed by $\Pi$. Replacing all such clauses with that literal would
therefore invalidate the counterexample. Since $\Pi{\models}$ is the set of literals entailed by $\Pi$, the set
$\Pi|\Pi{\models}$ contains all clauses of $\Pi$ containing a literal that is entailed by $\Pi$.

Lemma 8 Every clause of a 2CNF formula $\Pi$ either is in $\Pi|\Pi{\models}$ or does not contain literals
in $\Pi{\models}$ or their negation.

Proof. Let $l \in \Pi{\models}$. All clauses containing $l$ are in $\Pi|\Pi{\models}$ by definition. On the other hand,
all clauses containing the negation of $l$ are of the form $\neg l \vee l'$. Since $\Pi \models l$, we have $\Pi \models l'$.
Therefore $\neg l \vee l' \in \Pi|\Pi{\models}$. □

The following is an obvious consequence of the above lemma. 

Corollary 2 For every 2CNF formula $\Pi$, it holds that $\Pi \backslash (\Pi|\Pi{\models})$ and $\Pi{\models}$ do not share
variables.

This corollary is the basis of the next result. Indeed, it shows that $\Pi \backslash (\Pi|\Pi{\models}) \cup \Pi{\models}$ is
composed of two completely separated parts $\Pi \backslash (\Pi|\Pi{\models})$ and $\Pi{\models}$. The same therefore holds
for any of its subsets, and in particular for all of its I.E.S.'s. We show the following lemma
proving that, when replacing at once all clauses of $\Pi|\Pi{\models}$ with $\Pi{\models}$, the converse of Lemma 7 holds.

Lemma 9 Let $\Pi$ be a consistent 2CNF formula. If $\Pi_2$ is an I.E.S. of $\Pi \backslash (\Pi|\Pi{\models}) \cup \Pi{\models}$, then
$\Pi_2 \backslash \Pi{\models} \cup (\Pi|\Pi{\models})$ is equivalent to $\Pi$.

Proof. Since $\Pi$ and $\Pi \backslash (\Pi|\Pi{\models}) \cup \Pi{\models}$ are equivalent and $\Pi_2$ is an I.E.S. of $\Pi \backslash (\Pi|\Pi{\models}) \cup \Pi{\models}$, we
have that $\Pi_2$ and $\Pi$ are equivalent. The claim is proved by showing that $\Pi_2 \backslash \Pi{\models} \cup (\Pi|\Pi{\models})$
entails $\Pi{\models}$. This would prove that $\Pi_2 \backslash \Pi{\models} \cup (\Pi|\Pi{\models})$ is equivalent to $\Pi_2 \backslash \Pi{\models} \cup (\Pi|\Pi{\models}) \cup \Pi{\models}$,
which is a superset of $\Pi_2$ and is therefore equivalent to $\Pi$.

Intuitively, the proof is as follows: if $l \in \Pi{\models}$, then there is a proof of $l$ in $\Pi$. This
proof involves some clauses of $\Pi|\Pi{\models}$ and some clauses in $\Pi \backslash (\Pi|\Pi{\models})$. On the other hand,
everything that is entailed in $\Pi$ is also entailed in its equivalent formula $\Pi_2 \backslash (\Pi|\Pi{\models}) \cup \Pi{\models}$.
Since $\Pi_2 \backslash (\Pi|\Pi{\models})$ and $\Pi{\models}$ are built over disjoint literals, every clause of $\Pi$ not containing
literals that are entailed by $\Pi$ is derivable in $\Pi_2 \backslash (\Pi|\Pi{\models})$.

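Let us formally prove the claim. Let $l \in \Pi{\models}$, that is, $\Pi \models l$. We show that
$\Pi_2 \backslash \Pi{\models} \cup (\Pi|\Pi{\models}) \models l$. Since $\Pi \models l$, two conditions are possible: either $l \in \Pi$, or $l \notin \Pi$. In the first
case, $l \in \Pi|\Pi{\models}$, and the claim is true because $l$ is in $\Pi_2 \backslash \Pi{\models} \cup (\Pi|\Pi{\models})$.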

Let us now consider the case $l \notin \Pi$. Since $\Pi \models l$ and $\Pi$ is consistent, $\Pi \cup \{\neg l\}$ allows
deriving a pair of opposite literals by unit propagation. Let the following be the chains of
clauses used in the unit propagation from $\neg l$ to these pairs of opposite literals:

$p_1 \rightarrow p_2 \rightarrow \cdots \rightarrow p_{n-1} \rightarrow p_n$

$n_1 \rightarrow n_2 \rightarrow \cdots \rightarrow n_{n-1} \rightarrow n_n$

where $\neg l = p_1 = n_1$ and $n_n = \neg p_n$.

Consider an arbitrary link $l_i \rightarrow l_{i+1}$ of these two chains, corresponding to the clause
$\neg l_i \vee l_{i+1}$. By Lemma 8, either this clause is in $\Pi|\Pi{\models}$ or it does not share variables with $\Pi{\models}$.
Let us now consider this second case.

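Since this clause is in $\Pi$, it holds $\Pi_2 \models l_i \rightarrow l_{i+1}$. On the other hand, $\Pi_2 = (\Pi_2 \backslash \Pi{\models}) \cup
(\Pi_2 \cap \Pi{\models})$. These two parts of $\Pi_2$ contain disjoint literals because of Corollary 2. Since
$\Pi_2 \models l_i \rightarrow l_{i+1}$ and neither $l_i$, $l_{i+1}$, nor their negations are in $\Pi{\models}$, then $\Pi_2 \backslash \Pi{\models} \models l_i \rightarrow l_{i+1}$
because the other part of $\Pi_2$ contains literals that are mentioned neither in $\Pi_2 \backslash \Pi{\models}$ nor in
$l_i \rightarrow l_{i+1}$.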

Of the clauses of the two chains above, therefore, we have that a clause $l_i \rightarrow l_{i+1}$ is either
in $\Pi|\Pi{\models}$ or is entailed by $\Pi_2 \backslash \Pi{\models}$. As a result, $\Pi_2 \backslash \Pi{\models} \cup (\Pi|\Pi{\models})$ entails all these clauses, and
therefore the unit propagation from $\neg l$ leads to a pair of opposite literals in this formula. □

This lemma only proves that $\Pi_2 \backslash \Pi{\models} \cup (\Pi|\Pi{\models})$ is equivalent to $\Pi$, but does not prove it
is an I.E.S. of $\Pi$. In general, this is not true. However, we can show that an I.E.S. can be
obtained from this set by removing only some clauses of $\Pi|\Pi{\models}$.

Lemma 10 Let $\Pi$ be a consistent 2CNF formula. If $\Pi_2$ is an I.E.S. of $\Pi \backslash (\Pi|\Pi{\models}) \cup \Pi{\models}$, then
there exists $\Pi_1 \subseteq \Pi|\Pi{\models}$ such that $\Pi_1 \cup (\Pi_2 \backslash \Pi{\models})$ is an I.E.S. of $\Pi$.

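Proof. Lemma 9 shows that adding $\Pi|\Pi{\models}$ to $\Pi_2 \backslash \Pi{\models}$ results in a set that is equivalent to
$\Pi$. We are now trying to prove that an I.E.S. of $\Pi$ can be obtained from $\Pi_2 \backslash \Pi{\models} \cup (\Pi|\Pi{\models})$
without removing any clause of $\Pi_2 \backslash \Pi{\models}$. What we actually prove is that no clause of $\Pi_2 \backslash \Pi{\models}$
is redundant in $\Pi_2 \backslash \Pi{\models} \cup (\Pi|\Pi{\models})$, thus proving that all I.E.S.'s of $\Pi_2 \backslash \Pi{\models} \cup (\Pi|\Pi{\models})$ contain
all clauses of $\Pi_2 \backslash \Pi{\models}$ by Property 1.

Let $\gamma \in \Pi_2 \backslash \Pi{\models}$. Assume that $\gamma$ is redundant in $\Pi_2 \backslash \Pi{\models} \cup (\Pi|\Pi{\models})$, that is:

$(\Pi_2 \backslash \Pi{\models} \cup (\Pi|\Pi{\models})) \backslash \{\gamma\} \models \gamma$

Since all clauses in $\Pi|\Pi{\models}$ contain a literal in $\Pi{\models}$ by definition, we have that $\Pi{\models} \models \Pi|\Pi{\models}$.
As a result, $\Pi{\models}$ is logically stronger than $\Pi|\Pi{\models}$, and the above formula therefore implies:

$(\Pi_2 \backslash \Pi{\models} \cup \Pi{\models}) \backslash \{\gamma\} \models \gamma$

This is the same as $\Pi_2 \backslash \{\gamma\} \models \gamma$, contradicting the assumption that $\Pi_2$ is irredundant. □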

The converse of this lemma is an immediate consequence of a repeated application of 
Lemma 7. We can therefore conclude the following corollary. 

Corollary 3 $\Pi_2$ is an I.E.S. of $\Pi \backslash (\Pi|\Pi{\models}) \cup \Pi{\models}$ if and only if there exists $\Pi_1 \subseteq \Pi|\Pi{\models}$ such
that $\Pi_1 \cup (\Pi_2 \backslash \Pi{\models})$ is an I.E.S. of $\Pi$.

2.3 Cyclicity and Induced Graphs 

The presence of cycles of clauses in a formula determines the complexity of some problems
related to I.E.S.'s. Formally, cycles are defined as follows.

Definition 4 (Simple Cycle of Binary Clauses) A cycle of binary clauses is a sequence
of clauses $[\neg l_1 \vee l_2, \neg l_2 \vee l_3, \ldots, \neg l_n \vee l_1]$ such that no literal $l_i$ occurs in more than two clauses.

This definition only covers simple cycles of clauses, that is, we are not allowed to "cross"
the same literal twice. A non-simple cycle of clauses can be defined as a sequence
$[\neg l_1 \vee l_2, \neg l_2 \vee l_3, \ldots, \neg l_n \vee l_1]$ in which there is no pair of indexes $i, j$ with $i \neq j$ such that $l_j = \neg l_i$.
This definition is however not necessary because we only classify formulae based on whether
they have cycles or not. Since every formula having cycles also has simple cycles (and,
obviously, the other way around), the classification based on having or not having simple
cycles is sufficient.

The graph of a 2CNF formula induced by a literal is, roughly speaking, the graph of 
literals that can be derived from the given one by unit propagation. 

Definition 5 The graph induced by a literal $l$ on a 2CNF formula $\Pi$ is the minimal graph
such that:

1. $l$ is a node of the graph;

2. if $l'$ is a node of the graph and $\neg l' \vee l'' \in \Pi$, then $l''$ is a node of the graph and $(l', l'')$
is an edge of the graph.

A property of acyclic formulae is that all their induced graphs are acyclic and vice versa.

Property 4 A 2CNF formula $\Pi$ contains simple cycles if and only if some of its induced
graphs contain cycles.
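
Definition 5 and Property 4 can be illustrated by the following sketch, which builds the induced graph by a simple search and tests cyclicity by depth-first search on every induced graph. The integer encoding of literals (-v negates v), the dictionary representation of graphs, and the function names are assumptions of the example; only binary clauses contribute edges, and unit clauses are ignored.

```python
def induced_graph(formula, literal):
    # Definition 5: start from `literal`; a clause (not l' or l'') adds the edge (l', l'').
    graph = {literal: set()}
    frontier = [literal]
    while frontier:
        node = frontier.pop()
        for clause in formula:
            if len(clause) == 2 and -node in clause:
                (successor,) = clause - {-node}
                graph[node].add(successor)
                if successor not in graph:
                    graph[successor] = set()
                    frontier.append(successor)
    return graph

def has_cycle(graph):
    # Depth-first search: an edge to a node still on the stack closes a cycle.
    WHITE, GREY, BLACK = 0, 1, 2
    color = dict.fromkeys(graph, WHITE)

    def visit(node):
        color[node] = GREY
        for successor in graph[node]:
            if color[successor] == GREY:
                return True
            if color[successor] == WHITE and visit(successor):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)

def formula_has_cycle(formula):
    # Property 4 is used as the test: look for a cycle in some induced graph.
    literals = {lit for clause in formula for lit in clause}
    literals |= {-lit for lit in literals}
    return any(has_cycle(induced_graph(formula, lit)) for lit in literals)

# a=1, b=2, c=3.
acyclic = {frozenset({-1, 2}), frozenset({-2, 3})}
cyclic = {frozenset({-1, 2}), frozenset({-2, 1})}
```

For `acyclic`, the graph induced by a is the path from a to b to c, and the graph induced by the negation of c is the mirrored path over the negated literals.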

Formulae not containing cycles have some interesting properties. First, consistent acyclic
2CNF formulae not entailing any literal have a single I.E.S. This result is interesting also
because it makes some problems related to I.E.S.'s computationally simpler. The second
result about acyclic formulae is that, as far as two literals $l_1$ and $l_2$ that are entailed by the
formula are concerned, the choice of a minimal subset of clauses entailing $l_1$ can be done
independently of the choice for $l_2$. These two results are proved in two later sections.

2.4 Unit Clauses 

If $\Pi$ only contains binary clauses, then $\Pi \cup \{l\} \vdash_{UP} l'$ can only mean
that $l'$ is obtained from $\Pi$ by applying unit propagation starting from $l$, because $l$ is the only
unit clause of $\Pi \cup \{l\}$. If this is the case, $\Pi \cup \{l\} \vdash_{UP} l'$ is equivalent to the reachability of
$l'$ from $l$ in the graph of $\Pi$ induced by $l$.

In most cases, however, $\Pi$ cannot be assumed to be composed of binary clauses only. In
particular, even if we start from a formula made of binary clauses only, applying Corollary 1
or Lemma 3 leads to formulae containing unit clauses. On the other hand, every unit clause
$l$ can be replaced with the logically equivalent pair $\{l \vee l', l \vee \neg l'\}$, where $l'$ is a new variable
not occurring in the rest of the formula. Since $\{l\} \equiv \{l \vee l', l \vee \neg l'\}$, the redundancy of $l$ is
equivalent to the redundancy of the pair $\{l \vee l', l \vee \neg l'\}$, and most of the properties related
to I.E.S.'s are also unaffected by this replacement.

The only property that is changed by replacing $l$ with $\{l \vee l', l \vee \neg l'\}$ is about the size of
I.E.S.'s of a formula, as we are replacing a single clause with a pair of clauses. This problem
will be taken care of by counting such a pair as if it were a single clause in the algorithms for
checking the size of a minimal I.E.S. of a formula.
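
A possible way of carrying out this replacement is sketched below. It is an illustration only: the integer encoding of literals (-v negates v), the choice of fresh variables as the integers following the largest one in use, and the name `replace_unit_clauses` are assumptions of the example. The list of generated pairs is returned alongside the formula so that each pair can later be counted as a single clause.

```python
def replace_unit_clauses(formula):
    # Replace every unit clause l with the pair {l or z, l or not z}, z being fresh.
    next_variable = max((abs(lit) for clause in formula for lit in clause), default=0) + 1
    result = set()
    pairs = []                       # each pair stands for one original unit clause
    for clause in formula:
        if len(clause) == 1:
            (literal,) = clause
            pair = (frozenset({literal, next_variable}),
                    frozenset({literal, -next_variable}))
            result.update(pair)
            pairs.append(pair)
            next_variable += 1
        else:
            result.add(clause)
    return result, pairs

# a=1, b=2: the unit clause a and the binary clause (not a or b).
formula = {frozenset({1}), frozenset({-1, 2})}
binary_only, pairs = replace_unit_clauses(formula)
```

Here the unit clause a becomes the two clauses over a and the fresh variable 3, while the binary clause is left unchanged.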

In the rest of this paper, whenever we have to check whether $\Pi \cup \{l_1\} \vdash_{UP} l_2$ or
$\Pi \cup \{l_1\} \vdash_{UP} \bot$, we assume that this transformation has been preliminarily done on $\Pi$, so
that $\Pi$ only contains binary clauses. This way, checking unit propagation can be done by
looking at the graph of $\Pi$ induced by $l_1$. In particular, $\Pi \cup \{l_1\} \vdash_{UP} l_2$ means that, in the
graph of $\Pi$ induced by $l_1$, there is a path from $l_1$ to $l_2$.

By definition, $\Pi \cup \{l_1\} \vdash_{UP} \bot$ means that unit propagation from $l_1$ allows reaching a pair
of opposite literals $l_2$ and $\neg l_2$. Graphically, there exists a path from $l_1$ to $l_2$ and a path from
$l_1$ to $\neg l_2$ in the graph of $\Pi$ induced by $l_1$. Let $l_3$ be the last common literal of these two
paths.

Since $\neg l_2$ can be reached from $l_3$, we have that $\neg l_3$ can be reached from $l_2$. As a result,
there exists a path from $l_1$ to $l_3$, from $l_3$ to $l_2$, and from $l_2$ to $\neg l_3$. In other words, $\Pi \cup \{l_1\} \vdash_{UP} \bot$
implies that there is a single path starting from $l_1$ and that includes a pair of opposite
literals. In the sequel, whenever we say "all clauses in a path from $l_1$ to $\bot$", we mean a single
path starting from $l_1$ and ending with the first literal that is the opposite of another literal
in the path.



2.5 The Three Cases 

As reported in the Introduction, three cases are studied separately, both for the problem of 
redundancy checking and the problems about I.E.S.'s: 

1. the formula is inconsistent; 

2. the formula is consistent and implies some literals; 

3. the formula is consistent and does not imply literals. 

We can now explain why these three cases require a different analysis. Let us consider
first the last case: a formula not implying any literal. By Lemma 3, entailment of a clause
holds if and only if one of its literals can be derived by unit propagation from the negation of
the other one. The same must therefore be true for all I.E.S.'s of the formula. As a result,
the problem of redundancy and the problems about I.E.S.'s can be reformulated in terms
of formulae whose induced graphs have the same reachability relation as the graphs of the
original formula.




[Figure: two panels, labeled "formula" and "an I.E.S.".]



This requirement, however, is too restrictive when considering literals that are implied by
the formula. For example, if $l_2$ is reachable from $l_1$ in the original formula but $\neg l_1$ is entailed
by the formula, this reachability condition is not necessarily true in all I.E.S.'s of the formula.
Indeed, all that is needed is that unit propagation from $l_1$ allows reaching contradiction; the
condition that $l_2$ is reachable from $l_1$ is not necessarily true in all I.E.S.'s of the formula.



[Figure: two panels, labeled "formula" and "an I.E.S.", each including the node $\bot$.]



Finally, if a formula is inconsistent then its I.E.S.'s are only required to be inconsistent.
Not only are these I.E.S.'s no longer required to have the same reachability relation as the
original formula: they can even omit some literals altogether. Indeed, a subset of
an inconsistent formula can be inconsistent even if it does not mention some literals of the
original formula.

[Figure: two panels, labeled "formula" and "an I.E.S.", each including the nodes $x$ and $\bot$.]

Summarizing, the three cases are studied separately because of the different requirements
on equivalent subsets: for formulae not implying literals, reachability using unit propagation
is the same in the I.E.S.'s and in the formula; for formulae implying literals, only reachability
of $\bot$ from the negation of implied literals is the same; for inconsistent formulae, only the
reachability of $\bot$ from an arbitrary pair of opposite literals is the same.

2.6 Number of Clauses 

We show a lemma about clauses containing a literal that is entailed by the formula. 

Lemma 11 Every consistent 2CNF formula $\Pi$ such that $\Pi \models l$ and containing three or
more clauses containing $l$ has at least a redundant clause containing $l$.

Proof. If $l \in \Pi$, any other clause containing $l$ is redundant. Let us assume $l \notin \Pi$. Since $\Pi$ is
consistent, $\Pi \models l$ is equivalent to $\Pi \cup \{\neg l\} \vdash_{UP} \bot$. By definition, this condition is true if and
only if $\neg l$ allows deriving a pair of complementary literals $x$ and $\neg x$ by unit propagation.

Let us first assume that $l$ is neither $x$ nor $\neg x$. Consider the sequences of clauses
used in the derivation of $x$ and $\neg x$ from $\neg l$. Without loss of generality, we can assume that $l$
is only contained in the first clauses of these two sequences.

Removing all clauses containing $l$ but the first clause in each of these two sequences, we
obtain a formula that still entails $l$, and therefore allows deriving all clauses that have been
removed. Since at least three clauses contain $l$, at least one of them has been removed, and
this clause is redundant in $\Pi$.

A similar proof can be used for the case in which $l = x$ or $l = \neg x$. In this case, the first
clause of the sequence of clauses used in the derivation of $l$ from $\neg l$ allows deriving all other
clauses containing $l$. □

An obvious consequence of this lemma is that every consistent and irredundant 2CNF 
formula implying a literal contains at most two clauses containing that literal. 

2.7 Acyclic Consistent 2CNF Formulae not Implying Literals Have 
a Single I.E.S. 

We show that every acyclic consistent 2CNF formula not implying single literals has a single 
I.E.S. In this section, we assume that $\Pi$ is a 2CNF formula that is consistent, acyclic, and 
does not imply any single literal. Since $\Pi$ is consistent and $\Pi \not\models l$ for every literal $l$, we 
have that $\Pi \models \neg l \vee l'$ holds if and only if $\Pi \cup \{l\} \models_{UP} l'$. The same holds for all its subsets 
and, in particular, for all its I.E.S.'s. 

Lemma 12 If $\Pi \cup \{l\} \models_{UP} l'$ then $\Pi' \cup \{l\} \models_{UP} l'$ holds for every I.E.S. $\Pi'$ of the consistent 
acyclic 2CNF formula $\Pi$. 

Proof. Since $\Pi \cup \{l\} \models_{UP} l'$ it holds $\Pi \models \neg l \vee l'$, and the same therefore holds for $\Pi'$ because 
this formula is equivalent to $\Pi$. On the other hand, $\Pi'$ is consistent and does not entail 
literals because so is $\Pi$. As a result, $\Pi' \models \neg l \vee l'$ is equivalent to $\Pi' \cup \{l\} \models_{UP} l'$. □ 

Every literal $l$ partitions $\Pi$ into the set of clauses that are involved in the first step of 
unit propagation from $l$ and the other ones: 

$D_\Pi(l) = \{\gamma \in \Pi \mid \gamma = \neg l \vee l'\}$

$R_\Pi(l) = \Pi \backslash D_\Pi(l)$

The clauses in $D_\Pi(l)$ are those used in the first step of unit propagation from $l$. The 
following set $C_\Pi(l)$ is the set of literals that would result from this propagation. 

$C_\Pi(l) = \{l' \mid \neg l \vee l' \in \Pi\} = \{l' \mid \neg l \vee l' \in D_\Pi(l)\}$

Since all clauses $\neg l \vee l'$ in $D_\Pi(l)$ are also in $\Pi$, they are entailed by every I.E.S. $\Pi'$ of $\Pi$. 
Since $\Pi$ is a consistent CNF, so are all its subsets, and $\Pi'$ in particular. Since $\Pi'$ entails $\neg l \vee l'$ 
and $\Pi'$ is consistent and not entailing literals, $l'$ is reachable from $l$ in the graph induced by 
$l$ on $\Pi'$. 

In turn, a given $l' \in C_\Pi(l)$ is reachable from $l$ if and only if either $\neg l \vee l' \in \Pi'$, or 
there is another literal $l''$ such that $\neg l \vee l'' \in \Pi'$ and $l'$ is reachable from $l''$ using the edges 
corresponding to the clauses of $R_\Pi(l)$. Consider the following graph induced by $l$ on a 2CNF 
formula. 




In this example, $\neg l \vee l_4$ cannot be removed from $\Pi$ because it is the only clause allowing 
$l_4$ to be reached from $l$. The same holds for $\neg l \vee l_1$. The clauses $\neg l \vee l_2$ and $\neg l \vee l_3$ are instead 
redundant because $l_2$ and $l_3$ can be reached from $l$ by following the edge corresponding to 
the clause $\neg l \vee l_1$ and then following some edges corresponding to clauses in $R_\Pi(l)$. 

This example shows that the irredundant clauses are those containing the literals of $C_\Pi(l)$ 
that cannot be reached from other literals of $C_\Pi(l)$. Such literals necessarily exist because 
the formula (and therefore each of its induced graphs) is acyclic. 

$M_\Pi(l) = \{l' \in C_\Pi(l) \mid \not\exists l'' \in C_\Pi(l) \text{ such that } R_\Pi(l) \cup \{l''\} \models_{UP} l'\}$

Formally, $M_\Pi(l)$ is the set of literals that cannot be reached from other literals of $C_\Pi(l)$ 
using unit propagation on the clauses $R_\Pi(l)$. In the example above, $R_\Pi(l)$ and $M_\Pi(l)$ are as 
follows: 

The nodes that cannot be reached from other nodes are $l_1$ and $l_4$. Therefore, 
$M_\Pi(l) = \{l_1, l_4\}$. The next two lemmas formally prove that $M_\Pi(l)$ exactly characterizes the 
clauses containing $\neg l$ that are in some I.E.S.'s. 

Lemma 13 If $\Pi$ is a consistent acyclic 2CNF formula not implying literals and $l_1 \in M_\Pi(l)$ 
then any I.E.S. of $\Pi$ contains $\neg l \vee l_1$. 

Proof. Let us assume that $\Pi'$ is an I.E.S. of $\Pi$. Since $\Pi \cup \{l\} \models_{UP} l_1$, we have that 
$\Pi' \cup \{l\} \models_{UP} l_1$ by Lemma 12. Let us assume that $\neg l \vee l_1$ is not in $\Pi'$. Then, the path from 
$l$ to $l_1$ is made of an edge corresponding to a clause $\neg l \vee l_2$ with $l_2 \neq l_1$ followed by a path 
from $l_2$ to $l_1$. This path cannot include $l$ because otherwise $\Pi$ would be cyclic. Therefore, 
this path from $l_2$ to $l_1$ is all contained in $R_\Pi(l)$. This is however in contradiction with $l_1$ 
being in $M_\Pi(l)$. □ 

The second lemma is the converse of the previous one, stating that the literals of $C_\Pi(l)$ 
that are not in $M_\Pi(l)$ do not form clauses with $\neg l$ that are in any I.E.S. 

Lemma 14 If $l_1 \in C_\Pi(l) \backslash M_\Pi(l)$ then no I.E.S. of the consistent acyclic 2CNF formula $\Pi$ 
contains $\neg l \vee l_1$. 

Proof. By definition of $M_\Pi(l)$, if a literal $l_1$ is in $C_\Pi(l)$ but not in $M_\Pi(l)$, then there is a 
literal $l'$ in $M_\Pi(l)$ such that $R_\Pi(l)$ contains a path from $l'$ to $l_1$. Let $\Pi'$ be an I.E.S. of $\Pi$. Since 
$l_1$ is reachable from $l'$, it holds $\Pi \cup \{l'\} \models_{UP} l_1$ and therefore $\Pi' \cup \{l'\} \models_{UP} l_1$. 

By the previous lemma, $\Pi'$ contains $\neg l \vee l'$. As a result, $\Pi' \cup \{l\} \models_{UP} l'$. Since 
$\Pi' \cup \{l'\} \models_{UP} l_1$, we have that $l_1$ can be reached from $l$ by first using the clause $\neg l \vee l'$. Since the 
formula is acyclic, the clauses used in the derivation $\Pi' \cup \{l'\} \models_{UP} l_1$ cannot contain $l$. As 
a result, unit propagation allows reaching $l_1$ from $l$ without using the clause $\neg l \vee l_1$. In other 
words, $\Pi' \backslash \{\neg l \vee l_1\} \cup \{l\} \models_{UP} l_1$, thus proving that $\neg l \vee l_1$ is redundant in $\Pi'$, contradicting 
the assumption that $\Pi'$ is an I.E.S. □ 

We can then conclude that $M_\Pi(l)$ exactly identifies all clauses of $D_\Pi(l)$ that are in an 
I.E.S. of $\Pi$. 

Corollary 4 A clause $\neg l \vee l_1$ is in an I.E.S. of the consistent acyclic 2CNF formula $\Pi$ if 
and only if $l_1 \in M_\Pi(l)$. 

Once $l_1 \vee l_2$ is proved to be in an I.E.S. because $l_2 \in M_\Pi(\neg l_1)$, we do not need to also 
check $l_1 \in M_\Pi(\neg l_2)$. The same holds if $l_1 \vee l_2$ is proved not to be in an I.E.S. in the same way. 
This result tells that every 2CNF consistent acyclic formula that does not imply literals has 
a single I.E.S. 
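
The membership test given by Corollary 4 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: literals are encoded as nonzero integers (negation is a sign change), clauses as pairs, and the names `implication_graph`, `reachable`, `minimal_consequences` and `unique_ies` are introduced here for illustration. The input is assumed to be a consistent acyclic 2CNF formula that implies no literal and has no unit clauses.

```python
from collections import defaultdict, deque


def implication_graph(clauses):
    """Each clause (a, b) yields the edges -a -> b and -b -> a."""
    graph = defaultdict(set)
    for a, b in clauses:
        graph[-a].add(b)
        graph[-b].add(a)
    return graph


def reachable(graph, start):
    """Set of literals reachable from start, start included."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def minimal_consequences(clauses, l):
    """M(l): literals of C(l) not reachable from another literal of C(l) in R(l)."""
    d = [c for c in clauses if -l in c]          # D(l): clauses containing -l
    r = [c for c in clauses if -l not in c]      # R(l): all other clauses
    c_set = {x for c in d for x in c if x != -l}  # C(l)
    graph = implication_graph(r)
    reach = {x: reachable(graph, x) for x in c_set}
    return {x for x in c_set
            if not any(x in reach[other] for other in c_set if other != x)}


def unique_ies(clauses):
    """Keep the clause a v b exactly when b is in M(-a)."""
    return [(a, b) for a, b in clauses if b in minimal_consequences(clauses, -a)]


# Hypothetical instance mirroring the example: l=1, l_1=2, l_2=3, l_3=4, l_4=5.
example = [(-1, 2), (-1, 3), (-1, 4), (-1, 5), (-2, 3), (-3, 4)]
print(minimal_consequences(example, 1))
print(unique_ies(example))
```

Recomputing the reachable sets for every clause keeps the sketch short; it does not attempt to match the running time discussed later in the paper.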

Theorem 1 Every consistent acyclic 2CNF formula $\Pi$ not implying literals has a single 
I.E.S. 

Proof. For each clause $l_1 \vee l_2$, check whether $l_2 \in M_\Pi(\neg l_1)$. If this is true, then $l_1 \vee l_2$ is in 
all I.E.S.'s, otherwise it is in no I.E.S. Therefore, the set composed of all clauses $l_1 \vee l_2$ such 
that $l_2 \in M_\Pi(\neg l_1)$ is the single I.E.S. of $\Pi$. □ 

2.8 Implied Literals not in a Cycle of a Consistent 2CNF Formula 



We consider the clauses containing literals that are entailed by the formula but are not in a 
cycle of clauses of the formula. We show that, if $l'$ and $l''$ are two such literals, then we can 
independently choose among the clauses containing $l'$ and among the clauses containing $l''$ to form an 
I.E.S. In other words, an I.E.S. can be obtained by choosing a subset of clauses containing $l'$ 
and a subset of clauses containing $l''$, and these choices are independent from each other. 

By Lemma 2, since $\Pi$ is consistent, $\Pi \models \neg l$ holds if and only if $\Pi \cup \{l\} \models_{UP} \bot$. Using 
the transformation of Section 2.4, we can assume that $\Pi$ does not contain unary clauses, and 
therefore $\Pi \cup \{l\} \models_{UP} \bot$ means that we can reach a pair of opposite literals by propagating 
$l$ in $\Pi$. We partition the clauses of $\Pi$ into those containing $\neg l$ and those that do not, and 
define the set of literals that are direct consequences of $l$. 

$D_\Pi(l) = \{\gamma \in \Pi \mid \gamma = \neg l \vee l'\}$

$R_\Pi(l) = \Pi \backslash D_\Pi(l)$

$C_\Pi(l) = \{l' \mid \neg l \vee l' \in \Pi\}$

Since $\Pi \models \neg l$, every equivalent subset of $\Pi$ allows reaching a pair of opposite literals 
from $l$. This is possible if and only if one of the two following conditions is true: 

1. $\neg l \vee l_1 \in \Pi$ and $\Pi \cup \{l_1\} \models_{UP} \bot$; 

2. $\neg l \vee l_2 \in \Pi$, $\Pi \cup \{l_2\} \models_{UP} \neg l_3$, and $\neg l \vee l_3 \in \Pi$. 

If $\Pi'$ is an I.E.S. of $\Pi$, these conditions hold for $\Pi$ if and only if they hold for $\Pi'$. In addition, if 
$\Pi$ does not contain a cycle including $l$, the same holds for $\Pi'$. As a result, the unit propagation 
from $l_1$ to $\bot$ or from $l_2$ to $\neg l_3$ cannot include $l$, as otherwise $l$ would be part of a cycle. We 
show a sketch of the proof on the formula represented by the following figure. 

All paths from $l_1$ to $\bot$ and from $l_2$ to $\neg l_3$ are entirely contained in $R_\Pi(l)$: otherwise, $l$ 
would be part of a cycle. As a result, the same derivations are possible in an I.E.S. $\Pi'$ of 
$\Pi$ if and only if they are possible using only clauses of $R_\Pi(l)$, that is, they are possible in 
$\Pi' \cap R_\Pi(l)$. In other words, $\Pi' \cup \{l_1\} \models_{UP} \bot$ if and only if $(\Pi' \cap R_\Pi(l)) \cup \{l_1\} \models_{UP} \bot$, and 
the same for the derivation of $\neg l_3$ from $l_2$. As a result, $\Pi' \cap R_\Pi(l)$ always entails $\neg l$ with the 
addition of either $\neg l \vee l_1$ or the pair $\neg l \vee l_2$ and $\neg l \vee l_3$. 

Let us formally define the literals like $l_1$ and the pairs of literals like $l_2$ and $l_3$. 



In words, $S_\Pi(l)$ is the set of literals from which we can reach contradiction using unit propagation 
in $R_\Pi(l)$, while $P_\Pi(l)$ is composed of the pairs of literals such that the negation of one is 
reachable from the other in $R_\Pi(l)$. 

Lemma 15 If $\Pi'$ is an I.E.S. of the consistent acyclic 2CNF formula $\Pi$ such that $\Pi \models \neg l$ 
and $l$ is not in any cycle of clauses of $\Pi$, then the following are all I.E.S.'s of $\Pi$: 

1. $(\Pi' \cap R_\Pi(l)) \cup \{\neg l \vee l_1\}$ with $l_1 \in S_\Pi(l)$; 

2. $(\Pi' \cap R_\Pi(l)) \cup \{\neg l \vee l_2, \neg l \vee l_3\}$ with $(l_2, l_3) \in P_\Pi(l)$. 

Proof. $\Pi' \cap R_\Pi(l)$ contains all clauses of $\Pi'$ but those containing $\neg l$. As a result, if we can 
prove that the formulae above entail $\neg l$, that would prove that they are equivalent to $\Pi'$. 

Let us consider $(\Pi' \cap R_\Pi(l)) \cup \{\neg l \vee l_1\}$ first. Since $l_1 \in S_\Pi(l)$, we have that $\Pi \cup \{l_1\} \models_{UP} \bot$. 
Since $l$ is not part of a cycle, all derivations of $\bot$ from $l_1$ are entirely contained in $R_\Pi(l)$. 
Since $\Pi'$ is equivalent to $\Pi$, it holds $\Pi' \cup \{l_1\} \models \bot$; since $\Pi'$ is a subset of $\Pi$, all derivations 
of $\bot$ from $l_1$ only use clauses of $R_\Pi(l)$. As a result, $(\Pi' \cap R_\Pi(l)) \cup \{l_1\} \models \bot$, which implies 
that $(\Pi' \cap R_\Pi(l)) \cup \{\neg l \vee l_1\} \cup \{l\} \models \bot$. 

Since $\Pi'$ is irredundant, $\Pi' \cap R_\Pi(l)$ is irredundant as well. In order to prove that 
$(\Pi' \cap R_\Pi(l)) \cup \{\neg l \vee l_1\}$ is irredundant, observe that $\neg l \vee l_1$ is not redundant because $\Pi' \cap R_\Pi(l)$ does 
not contain clauses containing $\neg l$ and therefore cannot entail $\neg l$. Regarding the clauses of 
$\Pi' \cap R_\Pi(l)$, since they do not contain $\neg l$ by definition, Lemma 6 applies: they are redundant 
in $(\Pi' \cap R_\Pi(l)) \cup \{\neg l \vee l_1\}$ if and only if they are redundant in $\Pi' \cap R_\Pi(l)$, which is impossible 
because $\Pi'$ is irredundant. 

The proof for $(\Pi' \cap R_\Pi(l)) \cup \{\neg l \vee l_2, \neg l \vee l_3\}$ with $(l_2, l_3) \in P_\Pi(l)$ is similar. This formula 
implies $\neg l$ because $\neg l_3$ is reachable from $l_2$, and all paths from $l_2$ to $\neg l_3$ are in $R_\Pi(l)$. Since 
$\Pi'$ is equivalent to $\Pi$, it contains one such path, which is therefore all contained in $\Pi' \cap R_\Pi(l)$. 
As a result, the addition of the clauses $\neg l \vee l_2$ and $\neg l \vee l_3$ allows the entailment of $\neg l$. 

The proof of irredundancy of $(\Pi' \cap R_\Pi(l)) \cup \{\neg l \vee l_2, \neg l \vee l_3\}$ is also similar to the previous 
one. However, for this proof to work we also need the fact that neither $l_2$ nor $l_3$ is in $S_\Pi(l)$, 
and therefore one of them is not sufficient for entailing $\neg l$. □ 

This lemma shows a simple way for determining an I.E.S. of an acyclic consistent formula: 
for each literal $l$ such that $\Pi \models \neg l$, we choose either a clause $\neg l \vee l_1$ with $l_1 \in S_\Pi(l)$ or a pair 
of clauses $\neg l \vee l_2$ and $\neg l \vee l_3$ with $(l_2, l_3) \in P_\Pi(l)$. By the above lemma, we can make this choice 
for all clauses containing a literal that is entailed by the formula. 
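
As a rough illustration of the two kinds of choices, the sets $S_\Pi(l)$ and $P_\Pi(l)$ can be computed by plain reachability in the graph of $R_\Pi(l)$. The Python sketch below is an assumption-laden illustration rather than the paper's procedure: literals are nonzero integers, clauses are pairs without unit clauses, and the function names and the sample formula are hypothetical.

```python
from collections import defaultdict, deque


def implication_graph(clauses):
    """Each clause (a, b) yields the edges -a -> b and -b -> a."""
    graph = defaultdict(set)
    for a, b in clauses:
        graph[-a].add(b)
        graph[-b].add(a)
    return graph


def reachable(graph, start):
    """Set of literals reachable from start, start included."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def s_and_p(clauses, l):
    """Return (S(l), P(l)) for a literal l whose negation is entailed."""
    d = [c for c in clauses if -l in c]            # D(l)
    r = [c for c in clauses if -l not in c]        # R(l)
    c_set = {x for c in d for x in c if x != -l}   # C(l)
    graph = implication_graph(r)
    reach = {x: reachable(graph, x) for x in c_set}
    # S(l): propagation from x in R(l) reaches a pair of opposite literals.
    s = {x for x in c_set if any(-y in reach[x] for y in reach[x])}
    # P(l): pairs outside S(l) where the negation of one is reachable from the other.
    rest = c_set - s
    p = {(a, b) for a in rest for b in rest if a != b and -b in reach[a]}
    return s, p


# Hypothetical formula entailing -1: 1 -> 2 with 2 contradictory, and 3, 4 incompatible.
example = [(-1, 2), (-1, 3), (-1, 4), (-2, 5), (-2, -5), (-3, -4)]
print(s_and_p(example, 1))
```

In this instance an I.E.S. may keep either the single clause `(-1, 2)` or the pair `(-1, 3)`, `(-1, 4)` among the clauses containing `-1`.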



$S_\Pi(l) = \{l_1 \in C_\Pi(l) \mid R_\Pi(l) \cup \{l_1\} \models_{UP} \bot\}$

$P_\Pi(l) = \{(l_2, l_3) \mid l_2, l_3 \in C_\Pi(l) \backslash S_\Pi(l) \text{ and } R_\Pi(l) \cup \{l_2\} \models_{UP} \neg l_3\}$

3 Redundancy Checking 



In this section, we consider the problem of checking whether a 2CNF formula is redundant. 
The trivial algorithm of checking whether $\Pi \backslash \{\gamma\} \models \gamma$ for each $\gamma \in \Pi$ takes time $O(m^2)$, 
where $m$ is the number of clauses in $\Pi$. We improve over this result by showing algorithms 
that solve the problem in time $O(nm)$, where $n$ is the number of variables, in all three 
possible cases (inconsistent formulae, consistent formulae implying literals, and consistent 
formulae not implying literals). Cyclicity does not affect the problem of checking redundancy. 
Here is a summary of the results in the various cases. 

The formula is inconsistent. If a set is both inconsistent and irredundant, the number 
of its clauses is at most four times the number of variables. Therefore, if the number of 
clauses is greater, the set is redundant. If it is lower, we still have to check redundancy, 
but the running time $O(m^2)$ of the trivial algorithm is now the same as $O(nm)$. 

The formula is consistent. For each literal $l$, we proceed differently depending on whether 
$\Pi \models l$ or not. 

The formula implies the literal. If $\Pi \models l$ then either $\Pi$ is inconsistent or $\Pi \models_R l$. 
Since $\Pi$ is by assumption consistent, we have $\Pi \models_R l$. By Lemma 2, we have 
$\Pi \cup \{\neg l\} \models_{UP} \bot$. We consider two cases separately. 

$l$ is in three or more clauses. Formula $\Pi$ is redundant by Lemma 11. 

$l$ is in one or two clauses. By assumption $\Pi \cup \{\neg l\} \models_{UP} \bot$. In order to check 
redundancy, just remove either of the clauses containing $l$ and check whether 
UP still leads to $\bot$ from $\neg l$. Two UP derivations, which are linear in the number 
of clauses, are needed for the literal $l$. 

The formula does not imply the literal. Since $\Pi \not\models l$ we have $\Pi \backslash \{\neg l \vee l'\} \models \neg l \vee l'$ 
if and only if $\Pi \backslash \{\neg l \vee l'\} \cup \{l\} \models_{UP} l'$. Therefore, we can consider the graph 
induced by the unit propagation of $l$ in $\Pi$, and check whether $l'$ is reachable from 
$l$ without using the edge corresponding to $\neg l \vee l'$. This test can be done in linear 
time by a modified algorithm of graph reachability. 

The exact description of algorithms, and the proofs of their correctness, are given in the 
following three sections. 

3.1 Redundancy Checking: Inconsistent 2CNF Formulae 

Lemma 4 shows that every inconsistent 2CNF formula contains some clauses allowing both 
$x$ and $\neg x$ to derive $\bot$ by unit propagation. We show that the number of such clauses 
is necessarily linear in the number of variables, thus proving that every inconsistent 2CNF 
formula having a number of clauses that is not linear in the number of variables is redundant. 






Lemma 16 A 2CNF formula $\Pi$ is inconsistent and irredundant if and only if it is composed 
of two simple chains of clauses like the following ones: 

$x \vee l_1, \neg l_1 \vee l_2, \ldots, \neg l_m \vee y, \neg y \vee s_1, \ldots, \neg s_m \vee \neg y$

$\neg x \vee p_1, \neg p_1 \vee p_2, \ldots, \neg p_m \vee z, \neg z \vee q_1, \ldots, \neg q_m \vee \neg z$

Proof. If $\Pi$ is inconsistent, by Lemma 4 there exists a variable $x$ such that $\Pi \cup \{x\} \models_{UP} \bot$ 
and $\Pi \cup \{\neg x\} \models_{UP} \bot$. In turn, $\Pi \cup \{x\} \models_{UP} \bot$ implies the existence of a cycle-less chain 
allowing the derivation of $y$ and $\neg y$ from $x$ by unit propagation, as explained in Section 2.4. 
The same holds for $\Pi \cup \{\neg x\} \models_{UP} \bot$. The clauses of these two chains imply inconsistency. 
Therefore, if $\Pi$ contains other clauses, they are redundant. □ 

This lemma shows that every inconsistent and irredundant set of clauses is composed 
exactly of two chains of clauses, each one not containing the same literal twice. The length 
of each such chain is at most the number of literals; therefore, the number of clauses of 
such a formula is at most two times the number of literals. Therefore, if an inconsistent 
2CNF formula contains a number of clauses that is greater than four times the number of 
its variables, it is redundant. If it contains fewer clauses, $O(nm)$ and $O(m^2)$ are the same. 
Therefore, checking the consistency of $\Pi \backslash \{\gamma\}$ for each $\gamma \in \Pi$ has complexity $O(nm)$. 
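
The resulting test can be sketched in Python as follows. This is a minimal sketch under assumptions that are not part of the paper: literals are nonzero integers, clauses are distinct pairs, and the helper names are hypothetical. The consistency test used here is a naive reachability check chosen for brevity, not the linear-time one that the $O(nm)$ bound relies on.

```python
from collections import defaultdict, deque


def implication_graph(clauses):
    """Each clause (a, b) yields the edges -a -> b and -b -> a."""
    graph = defaultdict(set)
    for a, b in clauses:
        graph[-a].add(b)
        graph[-b].add(a)
    return graph


def reachable(graph, start):
    """Set of literals reachable from start, start included."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def is_consistent(clauses):
    """A 2CNF is inconsistent iff some x reaches -x and -x reaches x."""
    graph = implication_graph(clauses)
    variables = {abs(x) for c in clauses for x in c}
    return not any(-v in reachable(graph, v) and v in reachable(graph, -v)
                   for v in variables)


def inconsistent_is_redundant(clauses):
    """Redundancy test for a 2CNF formula assumed to be inconsistent."""
    variables = {abs(x) for c in clauses for x in c}
    if len(clauses) > 4 * len(variables):
        return True  # too many clauses to be irredundant
    # Otherwise fall back to the trivial test: a clause is redundant
    # when the formula stays inconsistent without it.
    return any(not is_consistent(clauses[:i] + clauses[i + 1:])
               for i in range(len(clauses)))


irredundant = [(1, 2), (1, -2), (-1, 3), (-1, -3)]
print(inconsistent_is_redundant(irredundant))
print(inconsistent_is_redundant(irredundant + [(2, 3)]))
```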

3.2 Redundancy Checking: Consistent 2CNF Formulae Implying 
Literals 

We study the problem of checking redundancy of a set of clauses in which some literals are 
implied. We show that we can check the redundancy of all clauses $l \vee l'$ such that $\Pi \models l$ 
in linear time. In other words, time $O(m)$ is required for every literal $l$ that is implied by 
$\Pi$. The redundancy of the other clauses can then be checked by verifying the redundancy of 
$\Pi \backslash (\Pi|l) \cup \{l\}$ by Lemma 1. After this check has been done for all literals that are entailed by 
$\Pi$, we obtain the formula $\Pi \backslash (\Pi|\Pi_{\models}) \cup \Pi_{\models}$, whose parts do not share variables by Lemma 2 
and whose first part $\Pi \backslash (\Pi|\Pi_{\models})$ does not entail any literal. The redundancy of the first part 
can therefore be checked using the algorithm of the next section. 

Let $l$ be a literal such that $\Pi \models l$. By Lemma 2, we have $\Pi \cup \{\neg l\} \models_{UP} \bot$, which means 
that unit propagation from $\neg l$ derives both a literal and its negation. In other words, there 
exists a variable $x$ such that $\Pi \cup \{\neg l\} \models_{UP} x$ and $\Pi \cup \{\neg l\} \models_{UP} \neg x$. By definition, there 
are then two acyclic paths in the graph of $\Pi$ induced by $\neg l$, one from $\neg l$ to $x$ and one from 
$\neg l$ to $\neg x$. The first clauses of these two paths are the only clauses that are necessary to allow 
the derivation of $x$ and $\neg x$ from $\neg l$. Regardless of whether these two clauses are the same 
or not, they are the only clauses containing $l$ that are necessary to prove $\Pi \cup \{\neg l\} \models_{UP} \bot$. 
As a result, if $l$ is contained in more than two clauses of $\Pi$, this formula is redundant. 

In order to check the redundancy of clauses $l \vee l'$ such that $\Pi \models l$, we first check whether 
the number of such clauses is greater than two. If this is the case, the set is redundant. 
Otherwise, we check the redundancy of the clauses $l \vee l'$ by simply performing the linear-time 
entailment check $\Pi \backslash \{l \vee l'\} \models l \vee l'$ for all such clauses $l \vee l'$. Since there are at most 
two such clauses, this test only requires linear time. This test is repeated for all literals $l$ 
such that $\Pi \models l$; therefore, the total running time is $O(nm)$. 

3.3 Redundancy Checking: Consistent 2CNF Formulae not Implying Literals 

We show that the redundancy of 2CNF consistent formulae not implying literals can be 
checked in time $O(nm)$, where $n$ is the number of literals and $m$ is the number of clauses of 
the formula. 

Let $\Pi$ be a consistent 2CNF formula not implying any literal, and let $\neg l \vee l'$ be one of 
its clauses. Lemma 3 can be simplified thanks to the assumption that no literal is implied: 
$\Pi \backslash \{\neg l \vee l'\} \models \neg l \vee l'$ holds if and only if $\Pi \backslash \{\neg l \vee l'\} \cup \{l\} \models_{UP} l'$. Indeed, neither $\neg l$ nor $l'$ 
is implied by $\Pi$, so they cannot be implied by $\Pi \backslash \{\neg l \vee l'\}$ either. 

We therefore only have to check whether $\Pi \backslash \{\neg l \vee l'\} \cup \{l\} \models_{UP} l'$, which can be done 
by checking whether the graph induced by $l$ on $\Pi$ contains a path from $l$ to $l'$ that does not 
contain the edge corresponding to the clause $\neg l \vee l'$. We now show that this check can be 
done for all clauses containing $\neg l$ at the same time in $O(m)$. Let $C_\Pi(l)$ be the following set 
of literals. 

$C_\Pi(l) = \{l' \mid \neg l \vee l' \in \Pi\}$

Redundancy of a clause $\neg l \vee l'$ is equivalent to the existence of a path in the graph of $\Pi$ 
induced by $l$ from another literal in $C_\Pi(l)$ to $l'$. In general, if there is a path from a node 
in $C_\Pi(l)$ to another node in $C_\Pi(l)$ not containing $l$, the formula is redundant. For example, 
the following formula is redundant, as we can delete the edge $l \rightarrow l''$, and still $l''$ is reachable 
from $l$. 




[Figure: graph induced by $l$, with the nodes of $C_\Pi(l)$ marked.]



The first step of the algorithm is to remove $l$ and all its incident edges from the graph. 
The second step is that of checking the existence of a path from $C_\Pi(l)$ to $C_\Pi(l)$ in the 
resulting graph. Note that no pair of opposite literals can be reached from $l$ because $\Pi$ does 
not entail the literal $\neg l$. 

A variant of the algorithm for node reachability can be used for doing this check while 
visiting the graph only once. The original algorithm for reachability is as follows. 

1. the starting node is marked $0$ while all other nodes are marked $\infty$; 

2. at each step, take the set $N_H$ of nodes that are marked with the highest finite integer; let $i$ 
be this integer; 

(a) for each $n \in N_H$, consider each of its successors $m$; 

(b) mark each $m$ with the minimum among $i + 1$ and its previous marker. 

3. if no label has been changed during Step 2, stop. 

The idea is that the label of a node is its distance from the starting node. By 
visiting the graph breadth-first, we consider each edge at most once in the whole process. 
This is why the algorithm is linear. 
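
The marking procedure above is an ordinary breadth-first visit. A minimal Python sketch, assuming the graph is given as a dictionary from each node to its list of successors (a representation chosen here for illustration), could look as follows; nodes that keep the mark $\infty$ are simply absent from the result.

```python
from collections import deque


def distances(successors, start):
    """Label every node reachable from start with its distance from start."""
    dist = {start: 0}            # the starting node is marked 0
    frontier = deque([start])
    while frontier:
        node = frontier.popleft()
        for nxt in successors.get(node, ()):
            if nxt not in dist:  # a node is labelled the first time it is met
                dist[nxt] = dist[node] + 1
                frontier.append(nxt)
    return dist


graph = {1: [2, 3], 2: [4], 3: [4], 4: []}
print(distances(graph, 1))
```

Using a queue instead of recomputing the set of nodes with the highest mark gives the same labels while looking at each edge once.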

This algorithm can be applied to the problem of redundancy if the graph is acyclic: start 
with the nodes in $C_\Pi(l)$ (instead of a single node), and visit the graph until a node in $C_\Pi(l)$ 
is reached. In other words, if we reach a situation in which the successor $m$ of a node $n \in N_H$ 
is in $C_\Pi(l)$, then the set of clauses is redundant. 

The algorithm does not work for cyclic graphs: the formula is redundant if a node in 
$C_\Pi(l)$ can be reached from another node in $C_\Pi(l)$. On the contrary, if a node in $C_\Pi(l)$ can 
be reached from itself, the formula is not necessarily redundant. More precisely, a cycle of 
this kind does not prove redundancy. In the following example, a node in $C_\Pi(l)$ is reachable 
from a node in $C_\Pi(l)$, but the path is a cycle; clearly, the set is not redundant. 




[Figure: graph in which a node of $C_\Pi(l)$ is reachable only from itself, through a cycle.]



The algorithm must be modified in such a way that it checks whether a node of $C_\Pi(l)$ can be 
reached from another node of $C_\Pi(l)$. This can be done by marking each node we visit not 
only with its distance from $C_\Pi(l)$, but also with the nodes of $C_\Pi(l)$ it is reachable from. 

This variant of the node reachability algorithm is however not linear because the set of 
nodes of $C_\Pi(l)$ a node is reachable from may grow to contain all nodes. However, when a 
node is reachable from two or more different nodes of $C_\Pi(l)$, we do not have to care about 
the node we started from any longer. Indeed, if a node $l'$ is reachable from two (or more) 
nodes $l_1, l_2 \in C_\Pi(l)$, and a node $l_3 \in C_\Pi(l)$ is reachable from $l'$, then either $l_1$ or $l_2$ must 
be different from $l_3$. As a result, the graph contains a path to $l_3$ from a node of $C_\Pi(l)$ 
other than $l_3$. Therefore, once a node $l'$ is known to be reachable 
from two different nodes of $C_\Pi(l)$, we only need to check whether a node of $C_\Pi(l)$ can be 
reached from $l'$. 

The algorithm is as follows. In a first phase, labels of nodes can be either $(i, n)$, where 
$i$ is an integer and $n \in C_\Pi(l)$, or the special mark two. 

1. each node $n \in C_\Pi(l)$ is marked with $(0, n)$; take $N_0 = C_\Pi(l)$; set $i = 0$; 

2. set $N_{i+1} = \emptyset$; 

3. for each $m \in N_i$, let $(i, n)$ be its marker; for each of its successors $t$: 

(a) if $t$ is not marked, mark it with $(i+1, n)$, and put $t$ in $N_{i+1}$; 

(b) if $t$ is marked with $(j, n)$, it must be $j \leq i + 1$; do not change its marker; 

(c) if $t$ is marked with $(j, s)$, with $n \neq s$, mark it with two; 

(d) if $t$ is marked with two, do not change its mark. 

4. if $N_{i+1}$ is empty, stop; otherwise, set $i = i + 1$ and go to step 2. 

This is almost the usual breadth-first visit of the graph. The point is that we mark the 
nodes not only with their distance from $C_\Pi(l)$, but also with the node they can be reached 
from. Whenever a node is found to be reachable from two nodes of $C_\Pi(l)$, we mark it 
with two and do not continue the search from it. If a node $m \in C_\Pi(l)$ is the successor of 
a node marked with $(i, n)$, and $n \neq m$, then the formula is redundant. Note that the whole 
algorithm is linear, as each edge is traversed at most once. 

We now have to visit the successors of the nodes we have marked with two, and check 
whether any node of $C_\Pi(l)$ can be reached from them. This can be done with the very same 
original reachability algorithm, which is still linear in time. 

This algorithm determines the redundancy of all clauses containing one literal in time 
$O(m)$. Therefore, all clauses can be checked in time $O(nm)$. 
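
The two phases can be sketched in Python as below. This is an illustrative sketch under assumptions, not the authors' code: literals are nonzero integers, clauses are pairs, the formula is assumed consistent and not implying literals, and the names `TWO` and `has_redundant_clause` are hypothetical. Distances are not stored because only the originating node of $C_\Pi(l)$ matters for the test.

```python
from collections import defaultdict, deque

TWO = object()  # special mark: reachable from two different nodes of C(l)


def has_redundant_clause(clauses, l):
    """True if some clause containing -l is redundant."""
    neg = -l
    starts = {x for c in clauses if neg in c for x in c if x != neg}  # C(l)
    # Graph of R(l): dropping the clauses containing -l removes the
    # edges leaving l, so no path can pass through l.
    graph = defaultdict(set)
    for a, b in clauses:
        if neg not in (a, b):
            graph[-a].add(b)
            graph[-b].add(a)

    # Phase 1: breadth-first visit from all of C(l), remembering the origin.
    origin = {s: s for s in starts}
    queue = deque(starts)
    while queue:
        m = queue.popleft()
        n = origin[m]
        if n is TWO:
            continue  # the search is not continued from nodes marked two
        for t in graph.get(m, ()):
            if t in starts and t != n:
                return True          # a node of C(l) reached from another one
            if t not in origin:
                origin[t] = n
                queue.append(t)
            elif origin[t] is not TWO and origin[t] != n:
                origin[t] = TWO

    # Phase 2: plain reachability from the nodes marked two.
    queue = deque(t for t, mark in origin.items() if mark is TWO)
    seen = set(queue)
    while queue:
        m = queue.popleft()
        for t in graph.get(m, ()):
            if t in starts:
                return True
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return False


direct = [(-1, 2), (-1, 3), (-2, 3)]                 # 2 reaches 3
cycle_only = [(-1, 2), (-2, 3), (-3, 2)]             # 2 only reaches itself
via_two = [(-1, 2), (-1, 3), (-2, 4), (-3, 4), (-4, 5), (-5, 2)]
print(has_redundant_clause(direct, 1),
      has_redundant_clause(cycle_only, 1),
      has_redundant_clause(via_two, 1))
```

The third instance exercises the second phase: node 4 is reached from both 2 and 3, gets the mark two, and node 2 is then found among its descendants.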

4 Irredundant Equivalent Subsets (I.E.S.'s) 

The following problems about I.E.S.'s are polynomial for 2CNF formulae because of the 
polynomiality of entailment for this restriction. 

check whether a formula is an I.E.S. of another formula. Check containment and 
irredundancy; 

check whether a clause is in all I.E.S.'s. A clause $\gamma$ is in all I.E.S.'s of a formula $\Pi$ if 
and only if $\Pi \backslash \{\gamma\} \not\models \gamma$ [Lib05]; 

uniqueness. A formula $\Pi$ has a unique I.E.S. if and only if $\{\gamma \in \Pi \mid \Pi \backslash \{\gamma\} \not\models \gamma\} \models \Pi$ 
[Lib05]. 
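
The last two conditions translate directly into entailment checks. The Python sketch below is only an illustration under assumptions: literals are nonzero integers, clauses are pairs, entailment of a clause is tested by adding the negation of its literals as duplicated-literal clauses, and consistency is tested by a naive (not linear-time) reachability check. All function names are hypothetical.

```python
from collections import defaultdict, deque


def reachable(graph, start):
    """Set of literals reachable from start, start included."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def is_consistent(clauses):
    """A 2CNF is inconsistent iff some x reaches -x and -x reaches x."""
    graph = defaultdict(set)
    for a, b in clauses:
        graph[-a].add(b)
        graph[-b].add(a)
    variables = {abs(x) for c in clauses for x in c}
    return not any(-v in reachable(graph, v) and v in reachable(graph, -v)
                   for v in variables)


def entails(clauses, clause):
    """clauses |= a v b iff clauses plus the units -a and -b is inconsistent."""
    a, b = clause
    return not is_consistent(list(clauses) + [(-a, -a), (-b, -b)])


def in_all_ies(clauses, clause):
    """A clause is in all I.E.S.'s iff the other clauses do not entail it."""
    return not entails([c for c in clauses if c != clause], clause)


def has_unique_ies(clauses):
    """Unique I.E.S. iff the clauses that are in all I.E.S.'s entail the formula."""
    core = [c for c in clauses if in_all_ies(clauses, c)]
    return all(entails(core, c) for c in clauses)


acyclic = [(-1, 2), (-1, 3), (-1, 4), (-1, 5), (-2, 3), (-3, 4)]
equivalences = [(-1, 2), (-2, 3), (-3, 1), (-2, 1), (-3, 2), (-1, 3)]
print(has_unique_ies(acyclic), has_unique_ies(equivalences))
```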

Two other problems require a more detailed analysis: the size of I.E.S. (checking whether 
a formula has an I.E.S. of size bounded by a number $k$) and the presence in an I.E.S. (checking 
whether a given clause is present in some I.E.S. of a given formula). The presence of cycles 
of clauses in the formula mostly determines the complexity of these two problems. 

As it is clear from the two tables below, the complexity (at least in the cases that have 
been successfully classified) does not depend on whether the formula is consistent or implies 
literals. However, the proofs are different in the various cases. In the two tables below, 
"single" indicates formulae implying a single literal and "nonsingle" indicates formulae not 
implying any literal. 

Size of I.E.S. 

            inconsistent    single      nonsingle 
acyclic     P               P           P 
cyclic      ??              NP-hard     NP-hard 

Presence in an I.E.S. 

            inconsistent    single      nonsingle 
acyclic     P               P           P 
cyclic      NP-hard         NP-hard     NP-hard 

We use the following order of the cases: first, all acyclic cases, then all the cyclic cases; in 
each case, we first consider inconsistent formulae, then consistent formulae implying literals, 
and then consistent formulae not implying literals. 

The following two results about consistent acyclic 2CNF formulae have already been 
proved. 

• Acyclic consistent formulae not implying single literals always have a unique I.E.S. 

• Building an I.E.S. for an acyclic consistent formula implying the literals $\neg l_1, \ldots, \neg l_n$ 
can be done by choosing independently some clauses containing $\neg l_1$, some clauses 
containing $\neg l_2$, etc. 

We show that these two facts allow proving that, for acyclic consistent 2CNF formulae, 
one can determine presence and size of I.E.S.'s in polynomial time. 

Theorem 2 The unique I.E.S. of a consistent acyclic 2CNF formula not implying literals 
can be found in polynomial time. 

Proof. Let $\Pi$ be a consistent acyclic 2CNF formula. Since it has a single I.E.S., a clause 
is in its I.E.S. if and only if it is in all its I.E.S.'s. Since checking presence can be done by 
checking whether $\Pi \backslash \{\gamma\} \models \gamma$, and this inference is linear-time for 2CNF formulae, we can conclude 
that checking the presence of each clause in the unique I.E.S. of the formula can be done in 
linear time. □ 

The proof of the theorem shows a quadratic algorithm for generating the single I.E.S. of 
$\Pi$ based only on the uniqueness of such an I.E.S. However, Corollary 4 allows for a slightly 
better algorithm: for each $l$, we can check all clauses $\neg l \vee l_i$ at once by visiting the graph of $R_\Pi(l)$. 
Since the construction and visit of this graph take linear time, the overall running time is $O(nm)$, 
where $n$ is the number of variables and $m$ is the number of clauses. 

Theorem 3 The problems of presence of a clause in an I.E.S. and of existence of an I.E.S. 
of given size are polynomial-time for acyclic consistent 2CNF formulae. 

Proof. By Lemma 15, the presence of a clause $\neg l \vee l'$ with $\Pi \models \neg l$ in an I.E.S. only depends on 
whether $l' \in S_\Pi(l)$ or there exists $l''$ such that $(l', l'') \in P_\Pi(l)$. Since the sets $S_\Pi(l)$ and $P_\Pi(l)$ can be 
checked in polynomial time by definition, checking the presence of a clause $\neg l \vee l'$ in an I.E.S. 
is a polynomial-time problem if $\Pi \models \neg l$. All other clauses can be checked in polynomial 
time thanks to Corollary 3 and Theorem 2. 

One of the smallest I.E.S.'s of an acyclic formula can be built in a similar way but choosing 
$\neg l$ or a clause $\neg l \vee l'$ such that $l' \in S_\Pi(l)$ if possible, and a pair of clauses $\neg l \vee l'$ and $\neg l \vee l''$ 
with $(l', l'') \in P_\Pi(l)$ otherwise. To take into account the pair of clauses $\neg l \vee l'$ and $\neg l \vee \neg l'$ 
that replace the unit clause $\neg l$ due to the transformation of Section 2.4, we count such a pair 
as if it were a single clause $\neg l \vee l'$ with $l' \in S_\Pi(l)$. □ 



5 Size of I.E.S. 

In this section, we consider the problem of checking whether a formula has an I.E.S. of 
size bounded by a given integer k. This problem has already been proved polynomial for 
consistent acyclic 2CNF formulae. The remaining cases are: 

• acyclic inconsistent 2CNF formulae; 

• cyclic 2CNF formulae, in all three cases. 

5.1 Size of I.E.S.: Acyclic Inconsistent 2CNF Formulae 

Let $\Pi$ be an inconsistent and acyclic 2CNF formula. By Lemma 4, $\Pi$ is inconsistent if and 
only if there exists a variable $x$ such that $\Pi \cup \{x\} \models_{UP} \bot$ and $\Pi \cup \{\neg x\} \models_{UP} \bot$. As explained 
in Section 2.4, $\Pi \cup \{x\} \models_{UP} \bot$ implies that there exists a path from $x$ to $l_1$ and from $l_1$ to 
$\neg l_1$. For the same reason, $\Pi \cup \{\neg x\} \models_{UP} \bot$ implies that there is a path from $\neg x$ to $l_2$ and 
from $l_2$ to $\neg l_2$. These paths can share nodes. We consider the various possible cases. 

Case 1: disjoint paths. 

Case 2: a common part "before the inconsistency". 

Case 3: a common part "at the inconsistency". 

This formula is cyclic: we can go from $\neg x$ to $\neg l_1$ and from $\neg l_1$ to $\neg x$ by inverting the path 
from $x$ to $l_1$. 

This formula is cyclic: we can go from $x$ to $\neg l_2$, and from $\neg l_2$ to $x$ by inverting the path 
from $\neg x$ to $l_2$. 

We can check the number of edges needed to reach each literal from each other: we build 
a table containing the number of clauses needed to go from any literal to any other one, if 
possible. We can then consider each case separately, and calculate how many clauses are 
necessary to form one of the patterns above. 

For example, for checking Case 1, we consider all possible triples of literals $(x, l_1, l_2)$, and 
sum up the distance from $x$ to $l_1$, the distance from $l_1$ to $\neg l_1$, etc. For each triple, we obtain 
the number of clauses needed to reach inconsistency according to Case 1. This procedure is 
done also for the other cases, and the minimal number is selected. 
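
The distance table and the Case 1 sum can be sketched in Python as follows. This is a hedged illustration, not the paper's algorithm in full: only Case 1 is covered, literals are nonzero integers, clauses are pairs, and the names `all_distances` and `case1_size` are hypothetical. Since $l_1$ and $l_2$ are chosen independently once $x$ is fixed, the sketch minimizes the two halves of the sum separately, which gives the same value as enumerating all triples.

```python
from collections import defaultdict, deque
import math


def all_distances(clauses):
    """table[a][b] = number of edges on a shortest path from a to b, if any."""
    graph = defaultdict(set)
    for a, b in clauses:
        graph[-a].add(b)
        graph[-b].add(a)
    literals = {sign * abs(x) for c in clauses for x in c for sign in (1, -1)}
    table = {}
    for start in literals:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in graph.get(node, ()):
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        table[start] = dist
    return table


def case1_size(clauses):
    """Minimal number of edges of a Case 1 pattern, or infinity if none exists."""
    table = all_distances(clauses)

    def cost(y):
        # Cheapest way to go from y to some literal l and from l to -l.
        return min((d + table[l][-l] for l, d in table[y].items()
                    if -l in table[l]), default=math.inf)

    variables = {abs(x) for c in clauses for x in c}
    return min((cost(x) + cost(-x) for x in variables), default=math.inf)


sample = [(1, 2), (1, -2), (-1, 3), (-1, -3)]
print(case1_size(sample))
```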

This algorithm is correct if the formula does not contain cycles. If the formula contains 
cycles, the algorithm is incorrect because it does not take into account that two edges in the 
above graphs can correspond to the same clause and should therefore be counted once, not 
twice. 

We prove that this is not possible if the formula is acyclic. Let $a \vee b$ be a clause. Its 
associated edges are $\neg a \rightarrow b$ and $\neg b \rightarrow a$. This clause is "counted twice" if: 

1. the same edge occurs twice in the same path: if $\neg a \rightarrow b$ occurs twice in the same path, 
we have a cycle from $\neg a$ to $\neg a$; 

2. the same edge occurs once in both paths but not in their common part (if any): if 
$\neg a \rightarrow b$ occurs both in the path from $x$ and in the path from $\neg x$, we are in the 
situation in which $\neg a$ is reachable from both $x$ and $\neg x$ and from $\neg a$ we can reach a 
pair of opposite literals; the pattern in which what follows $\neg a$ is only counted once has 
then been considered as a Case 2 or Case 3 in which $\neg a$ is the point where the paths 
join; 

3. the edges $\neg a \rightarrow b$ and $\neg b \rightarrow a$ occur in the same path: we can remove the last edge and 
what follows because contradiction has already been reached; 

4. the edges $\neg a \rightarrow b$ and $\neg b \rightarrow a$ occur in two different paths: for example, if the first 
occurs in the path from $x$ and the second in the path from $\neg x$, we can go from 
$x$ to $\neg a$ to $b$, and then from $b$ to $x$ by reversing the path from $\neg x$ to $\neg b$, which is a cycle. 

We can conclude that, if we check all possible cases, we always end up with the minimal 
ones in which each edge corresponds to a unique clause. Since checking distances in graphs 
is linear, we can prove the following theorem. 
Theorem 4 Checking whether an acyclic inconsistent 2CNF formula has an inconsistent
subset of size less than or equal to $k$ is polynomial.

Proof. We determine the minimal number of edges that are necessary to form any of the 
four combinations above. We have proved that any way of reaching inconsistency can be 
recast as one of them with no repeated clause. Therefore, the minimal number of edges of 
any case is the minimal number of clauses needed to form an inconsistency. □ 

In order to take into account the fact that a pair of clauses $l \vee l'$ and $l \vee \neg l'$ might represent
the same original unit clause $l$ because of the transformation of Section 2.4, we count as 1/2
instead of 1 the edges that have been introduced by this transformation.

5.2 Size of I.E.S.: Cyclic Consistent 2CNF Formulae Implying Literals

We show that Lemma 15 does not hold if the literal $l$ under consideration is in a cycle of
clauses. Consider the following formula:

$\Pi = \{\neg l' \vee l,\ \neg l \vee l',\ \neg l \vee l'',\ \neg l' \vee l'',\ \neg l'' \vee y,\ \neg l'' \vee \neg y\}$

The graph of $\Pi$ induced by $l$ is the following one, showing a simple cycle between the
literals $l$ and $l'$.

[Figure: graph of $\Pi$ induced by $l$.]

For this formula, $S_\Pi(l) = \{l', l''\}$, while $P_\Pi(l) = \emptyset$. According to Lemma 15, if $\Pi'$ is an
i.e.s. of $\Pi$, then we can replace its clauses of $D_\Pi(l)$ with a single clause $\neg l \vee l'$ with
$l' \in S_\Pi(l)$, and what results is still an i.e.s. of $\Pi$.

We show a counterexample. The formula $\Pi$ is equivalent to $\{\neg l, \neg l', \neg l''\}$. The following
is an i.e.s. of $\Pi$.

$\Pi' = \{\neg l' \vee l,\ \neg l \vee l'',\ \neg l'' \vee y,\ \neg l'' \vee \neg y\}$

Graphically, $\Pi'$ is the following subset of the formula:
This subset is equivalent to $\Pi$ because it entails all three literals $\neg l$, $\neg l'$, and $\neg l''$. Indeed, $\neg l''$
is entailed by the clauses $\neg l'' \vee y$ and $\neg l'' \vee \neg y$, while $\neg l$ and $\neg l'$ are entailed thanks to $\neg l' \vee l$
and $\neg l \vee l''$, which make $l''$ reachable by unit propagation from $l$ and $l'$.

According to Lemma 15, we should be able to obtain another i.e.s. from $\Pi'$ by replacing
all clauses of $D_\Pi(l)$ with $\neg l \vee l'$ in it, because $l' \in S_\Pi(l)$. Let
$\Pi'' = \Pi' \setminus D_\Pi(l) = \{\neg l' \vee l,\ \neg l'' \vee y,\ \neg l'' \vee \neg y\}$.
If Lemma 15 were true for cyclic formulae, $\Pi'' \cup \{\neg l \vee l'\}$ would be
another i.e.s. This is however false, as this set implies neither $\neg l$ nor $\neg l'$. Graphically,
$\Pi'' \cup \{\neg l \vee l'\}$ is the following formula:

[Figure: graph of $\Pi'' \cup \{\neg l \vee l'\}$.]

This formula still entails $\neg l''$, but it no longer entails either $\neg l$ or $\neg l'$. These are
indeed the two literals involved in the only cycle of this formula. Intuitively, Lemma 15 does
not hold for cyclic formulae because, while $\Pi' \cup \{l'\} \models_{UP} \bot$ still holds in any i.e.s. $\Pi'$ of $\Pi$
for every $l' \in S_\Pi(l)$, the unit derivation of $\bot$ from $l'$ might involve the literal $l$. Therefore,
the choice of clauses containing $l$ cannot be made independently of the choices of the other
clauses.
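
The counterexample can be checked mechanically. The following sketch assumes a hypothetical integer encoding ($l = 1$, $l' = 2$, $l'' = 3$, $y = 4$, negative integers for negated literals) and uses brute-force entailment over all assignments, which is only viable for such tiny formulae.

```python
from itertools import product

def models(clauses, n_vars):
    """Yield every assignment (dict var -> bool) satisfying all clauses."""
    for values in product([False, True], repeat=n_vars):
        assign = dict(zip(range(1, n_vars + 1), values))
        if all(any(assign[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            yield assign

def entails(clauses, lit, n_vars):
    """True if every model of the clauses satisfies the literal."""
    return all(m[abs(lit)] == (lit > 0) for m in models(clauses, n_vars))

L, L1, L2, Y = 1, 2, 3, 4          # l, l', l'', y (assumed encoding)

# The subset that entails -l, -l' and -l''.
pi_prime = [(-L1, L), (-L, L2), (-L2, Y), (-L2, -Y)]

# Same set with the clause -l v l'' replaced by -l v l'.
swapped = [(-L1, L), (-L, L1), (-L2, Y), (-L2, -Y)]

entailed_before = [entails(pi_prime, -v, 4) for v in (L, L1, L2)]
entailed_after = [entails(swapped, -v, 4) for v in (L, L1, L2)]
```

After the replacement, $l$ and $l'$ are only forced to be equal, so neither of their negations is entailed any longer.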

We now consider a generalization of the example above: we are given a set of literals 
such that: 

1. their negations are all entailed by $\Pi$;

2. each literal in the set is reachable from any other one.

We study the problem of finding a minimal set of clauses containing these literals and
that are part of an i.e.s. Given any literal $l$ such that $\Pi \models \neg l$, its induced graph is composed
of a strongly connected component including $l$, joined to the rest of the graph. Regarding
the rest of the graph, we are only interested in the nodes that are joined to a node of the
component.

For every literal $l$, we define $CC_\Pi(l)$ as the following set of literals.

Notation: $CC_\Pi(l) = \{ l' \mid l \text{ and } l' \text{ are in a cycle in the graph induced by } l \text{ on } \Pi \}$

Since $\Pi \models \neg l'$ for every $l' \in CC_\Pi(l)$, the same holds in every i.e.s. of $\Pi$. Assuming that $\Pi$
contains no unary clause as discussed in Section 2.4, the graph of $\Pi$ induced by $l$ is composed
of a strongly connected component made of the nodes of $CC_\Pi(l)$, and other nodes and edges.
Since $\Pi$ implies the negation of all literals of $CC_\Pi(l)$, this graph also contains paths from
nodes of $CC_\Pi(l)$ to pairs of opposite literals. These literals cannot both be part of $CC_\Pi(l)$ as
otherwise $\Pi$ would be inconsistent.



This figure shows the three possibilities: either from a node of $CC_\Pi(l)$ we can derive $\bot$,
or from two nodes of $CC_\Pi(l)$ we can reach a pair of inconsistent literals. The nodes that are
reachable with a single edge from $CC_\Pi(l)$ will be denoted by $JC_\Pi(l)$:
Notation: $JC_\Pi(l) = \{ l' \mid \exists l'' \in CC_\Pi(l) \text{ and } \neg l'' \vee l' \in \Pi \}$
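
Both sets can be computed by plain reachability. The sketch below makes some assumptions that are not in the text: the induced graph is given as an adjacency dictionary, $l$ itself is included in the component, and the frontier excludes the literals of the component (following the remark that the literals of $JC_\Pi(l)$ considered later are not in $CC_\Pi(l)$). The function names are hypothetical.

```python
def reachable(graph, start):
    """Set of nodes reachable from start (start included)."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in graph.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def reverse(graph):
    """Graph with every edge u -> v turned into v -> u."""
    rev = {}
    for u, succs in graph.items():
        rev.setdefault(u, set())
        for v in succs:
            rev.setdefault(v, set()).add(u)
    return rev

def cc_and_jc(graph, lit):
    """Component of lit (mutual reachability) and its one-edge frontier."""
    cc = reachable(graph, lit) & reachable(reverse(graph), lit)
    jc = {v for u in cc for v in graph.get(u, ()) if v not in cc}
    return cc, jc

# Hypothetical graph with a cycle between l and l'.
graph = {"l": {"l'", "l''"}, "l'": {"l"}, "l''": {"y", "-y"}}
cc, jc = cc_and_jc(graph, "l")
```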



The idea is that we do not need to care about "what happens outside $CC_\Pi(l)$": if $\bot$ or
$\neg l$ can be reached from a literal $l_1$ of $JC_\Pi(l)$ in $\Pi$, that will also hold for every i.e.s. of $\Pi$.
Since $CC_\Pi(l)$ is the largest connected component including $l$, this path from $l_1$ to $\bot$ or $\neg l$
cannot re-enter $CC_\Pi(l)$. Therefore, which clauses are chosen from $CC_\Pi(l)$ to be in the i.e.s.,
and which clauses are chosen outside $CC_\Pi(l)$, do not interact with each other.

For each $l' \in JC_\Pi(l)$ we define $LC_\Pi(l')$ to be the set of literals that can be derived from
$l'$ using unit propagation. Since $l'$ is not in $CC_\Pi(l)$, unit propagation cannot include literals
in $CC_\Pi(l)$.

Notation: $LC_\Pi(l) = \{ l' \mid \exists l'' \in JC_\Pi(l) \text{ and } \Pi \cup \{l''\} \models_{UP} l' \}$

The idea is that all literals of $CC_\Pi(l)$ imply $\bot$ in $\Pi$, and the same must therefore happen
in any i.e.s. of $\Pi$. Moreover, any i.e.s. of $\Pi$ must allow deriving all literals of $LC_\Pi(l')$
from $l'$, and the clauses involved in this unit propagation cannot contain literals in $CC_\Pi(l)$.
Therefore, we can choose the clauses in $CC_\Pi(l)$ assuming that each $l' \in JC_\Pi(l)$ implies
$LC_\Pi(l')$ regardless of this choice.

In the figure above, the nodes in $CC_\Pi(l)$ are all connected to each other. The nodes
$l_1, \ldots, l_6$ form $JC_\Pi(l)$, which is the "frontier" of the rest of the graph. Since $l$ implies $\bot$,
$LC_\Pi(l)$ contains either a node like $l_1$, which implies $\bot$ or $\neg l$ alone, or a pair of nodes like $l_2$
and $l_3$, that imply a pair of opposite literals. Any such pair may be reachable either from a
single node of $CC_\Pi(l)$, like $l_2$ and $l_3$, or from two different nodes, like $l_4$ and $l_5$.

The idea is that any i.e.s. will allow deriving $\bot$ or $\neg l$ from $l_1$ without using any clause
with literals in $CC_\Pi(l)$, as otherwise $l_1$ would be in $CC_\Pi(l)$ as well. For the same reason,
it will be possible to derive $l_7$ from $l_3$, etc. Therefore, we can choose the subset of clauses
with literals in $CC_\Pi(l)$ first, and then complete the i.e.s. by independently selecting literals
in the rest of the graph. Therefore, once we have chosen clauses with one literal in $CC_\Pi(l)$,
the problem reduces to the acyclic case.

How can we choose a subset of edges joining nodes of $CC_\Pi(l)$ to form a minimal i.e.s.?
In this subset, every literal in $CC_\Pi(l)$ must imply $\bot$. If a subset has this property, all other
clauses whose edge starts from a node of $CC_\Pi(l)$ are redundant. We consider three cases
separately, and establish the size of minimal i.e.s.'s in each case.

$JC_\Pi(l)$ contains a literal that implies $\bot$ or $\neg l$ in $\Pi$. In the example above, $l_1$ is in this
condition. We prove that the minimal size of i.e.s.'s is equal to the number of literals
of $CC_\Pi(l)$.

Choose a node that is connected to $l_1$, and add the edge from this node to $l_1$, and
proceed recursively adding edges only for nodes that are not already connected to $l_1$.
This algorithm makes all nodes of $CC_\Pi(l)$ connected to $l_1$ because $CC_\Pi(l)$ is strongly
connected: indeed, each node is visited, and an edge is added to connect it to $l_1$ if it
is not already. Moreover, for each node the resulting set of edges contains at most one
outgoing edge. This proves that the i.e.s. has size equal to that of $CC_\Pi(l)$. No i.e.s.
can be smaller: a smaller set of edges necessarily leaves one node not joined to any
edge, so that this node does not imply $\bot$ as required.
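
The selection just described amounts to a backward breadth-first visit from $l_1$. A minimal sketch follows, assuming the graph is an adjacency dictionary and that `cc` is the strongly connected component; the function name `edges_towards` and the node names of the example are hypothetical.

```python
from collections import deque

def edges_towards(graph, cc, target):
    """Pick one outgoing edge per node of cc so that all of them reach target."""
    # Predecessor lists, restricted to edges leaving a node of cc.
    preds = {}
    for u in cc:
        for v in graph.get(u, ()):
            preds.setdefault(v, []).append(u)

    chosen = {}                 # node -> successor selected for it
    queue = deque([target])
    while queue:
        v = queue.popleft()
        for u in preds.get(v, ()):
            if u not in chosen:     # u is not yet connected to target
                chosen[u] = v
                queue.append(u)
    return chosen

# Hypothetical component a -> b -> c -> a, with b joined to l1.
graph = {"a": {"b"}, "b": {"c", "l1"}, "c": {"a"}}
chosen = edges_towards(graph, {"a", "b", "c"}, "l1")
```

Every node of the component receives exactly one outgoing edge, so the number of selected edges equals the size of the component, as stated above.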

A node of $CC_\Pi(l)$ is connected to two nodes of $JC_\Pi(l)$ that imply $\bot$. In the example
above, $l_2$ and $l_3$ are two nodes in this condition. We assume that no node of $JC_\Pi(l)$
implies $\bot$ or $\neg l$; that is, we are not in the case above.

We prove that all nodes of $CC_\Pi(l)$ can be made connected to $\bot$ by choosing a number
of edges that is equal to the size of $CC_\Pi(l)$ plus one. Such a set of edges exists because
all nodes can be connected to $\bot$ by selecting the two edges to $l_2$ and to $l_3$ and then
repeating the algorithm of the previous case from the common predecessor.

We now prove that no smaller subset of edges makes all nodes of $CC_\Pi(l)$ connected to
$\bot$. Let us assume the converse: there exists a subset of $|CC_\Pi(l)|$ edges that makes all
nodes of $CC_\Pi(l)$ connected to $\bot$. Since each node has to be connected to some other
node, we have that each node $l'$ of $CC_\Pi(l)$ is connected to exactly one other node $l''$.
This other node can be either in $CC_\Pi(l)$ or in $JC_\Pi(l)$. In the second case, $l'$ is only
connected to $l''$. In order for $l'$ to be connected to $\bot$, $l''$ must be connected to $\bot$
or $\neg l$ as well, contradicting the assumption that no node of $JC_\Pi(l)$ is connected to $\bot$
or $\neg l$. We can therefore conclude that any node of $CC_\Pi(l)$ is connected to exactly one
other node of $CC_\Pi(l)$. However, $CC_\Pi(l)$ cannot contain a pair of opposite literals, as
otherwise $\Pi$ would be inconsistent. As a result, no node of $CC_\Pi(l)$ is connected to $\bot$,
contradicting an assumption.

Two nodes of $CC_\Pi(l)$ are joined to two nodes of $JC_\Pi(l)$ implying $\bot$. Here, we assume
that we are not in one of the two cases above: no single literal of $JC_\Pi(l)$ is connected
to $\bot$, and no single literal of $CC_\Pi(l)$ is connected to two literals of $JC_\Pi(l)$ implying
two opposite literals. In the example above, the two nodes $l_5$ and $l_7$ of $JC_\Pi(l)$ are in
the conditions we assumed. We are also assuming that no pair of nodes are like $l_2$ and
$l_3$, and that no single node is like $l_1$.

Let us assume there is only one pair of nodes in the conditions we assumed. We show
how subsets of minimal size connecting all nodes of $CC_\Pi(l)$ to $\bot$ are made. First, take
the edges from $l_4$ to $l_5$ and from $l_6$ to $l_7$. If there exists a simple cycle containing both $l_4$
and $l_6$, take its edges, and then visit the rest of $CC_\Pi(l)$ backwards by adding edges from
nodes that are not already connected to both $l_4$ and $l_6$. This set of edges is composed
of $|CC_\Pi(l)| + 1$ edges, and no smaller set maintains all nodes of $CC_\Pi(l)$ connected to
$\bot$.

The point is that $l_4$ has to be connected to $l_6$ and vice versa. Therefore, any i.e.s.
contains a cycle including $l_4$ and $l_6$. Simple cycles have exactly one edge for each node,
and are therefore optimal from this point of view, as all nodes of the cycle have to be
connected to $l_4$ and $l_6$ anyway.

If there is no simple cycle including both $l_4$ and $l_6$, we consider a cycle composed of
more than one simple cycle. If two cycles have one or more common points, we have
one more edge w.r.t. the optimal situation. Therefore, we have to minimize the number
of common points among cycles.
In the first two cases, finding the minimal size of i.e.s.'s is easy, as it amounts to counting
the number of nodes in $CC_\Pi(l)$, and adding one in the second case. In the third case, however,
we have to check the existence of a simple cycle including two nodes. This problem is
NP-complete.

Theorem 5 Deciding whether a graph contains a simple cycle including two given nodes is
NP-complete.

Proof. Membership is obvious via a guess-and-check algorithm. Hardness is proved by
reduction from the problem path via a node. This is the problem of determining whether there
exists a simple path from a node $x$ to a node $y$ that includes a node $m$, in a graph $G$. This
problem is NP-complete [LP84].

Given a graph G and three nodes x, y, and m, we build a graph G' by first removing all 
edges that are incoming to x, and then adding the edge from y to x. This graph contains 
a simple cycle including both x and m if and only if there exists a simple path from x to y 
including m. 

First, the removal of incoming edges to x does not change the simple paths from x to y, 
as no such path, being simple, contains an edge that is incoming to x. For the same reason, 
the addition of the edge from y to x does not change the set of simple paths from x to y. 
However, since the edge from y to x is the only incoming edge to x, it is contained in any 
cycle including x. Therefore, any simple cycle including x is composed of a simple path from 
x to y and of the edge from y to x. Therefore, a path from x to y including m exists if and 
only if a simple cycle including both x and m exists. □ 
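
The construction of $G'$ can be illustrated on small graphs. In the sketch below, graphs are sets of directed edges, and `simple_cycle_through` is a hypothetical brute-force (exponential) checker that is only meant for testing the reduction on toy instances.

```python
def reduce_path_via_node(edges, x, y):
    """Build G': drop every edge entering x, then add the edge y -> x."""
    new_edges = {(u, v) for (u, v) in edges if v != x}
    new_edges.add((y, x))
    return new_edges

def simple_cycle_through(edges, a, b):
    """Brute force: is there a simple cycle containing both a and b?"""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)

    def dfs(node, visited, seen_b):
        for nxt in succ.get(node, ()):
            if nxt == a and seen_b:
                return True          # cycle closed after passing through b
            if nxt not in visited:
                if dfs(nxt, visited | {nxt}, seen_b or nxt == b):
                    return True
        return False

    return dfs(a, {a}, a == b)

# A simple path x -> m -> y exists: G' has a simple cycle through x and m.
g1 = reduce_path_via_node({("x", "m"), ("m", "y"), ("y", "q"), ("q", "x")}, "x", "y")

# No simple path from x to y goes through m, although x, y, m lie on a cycle.
original = {("x", "y"), ("y", "m"), ("m", "x")}
g2 = reduce_path_via_node(original, "x", "y")
```

The second instance shows why the edges entering $x$ are removed first: the original graph has a simple cycle through $x$ and $m$ that does not correspond to any simple path from $x$ to $y$ via $m$.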

This theorem allows proving that the problem of checking whether a set of clauses implying a
single literal, and containing cycles in the graph induced by the negation of this literal, has
an i.e.s. of a given size is NP-complete.

Theorem 6 Deciding whether a consistent 2CNF formula implying a single literal $l$ such
that the graph induced by $\neg l$ on the graph contains cycles has an i.e.s. of size $k$ is NP-complete.

Proof. Hardness is proved by reduction from the problem of checking the existence of a
simple cycle in a graph containing two given nodes $x$ and $y$. First, check whether $y$ is
reachable from $x$ and vice versa: if not, there is no simple cycle including both $x$ and $y$. The
second step is the deletion of any node from which $x$ is not reachable. This step clearly
does not alter the set of cycles including $x$. Then, build the set of clauses obtained from the
edges of the graph, and add the two clauses $\neg x \vee z$ and $\neg y \vee \neg z$. All literals imply $\bot$ because
all nodes are connected to $x$, which is connected to $y$. Set $k$ to the number of nodes of the
graph plus one. The original graph has a simple cycle including both $x$ and $y$ if and only if
we can build an i.e.s. for the set of clauses that is made of a simple cycle including $x$ and
$y$, of the clauses $\neg x \vee z$ and $\neg y \vee \neg z$, and of one clause for any other literal. □
5.3 Size of I.E.S.: Cyclic Consistent Formulae not Implying Literals

The problem of determining whether a formula has an i.e.s. of size at most $k$ is NP-complete
if the formula is consistent and cyclic. This is in particular true even if the formula does not
entail any literal.

Theorem 7 Deciding whether a set of binary clauses $\Pi$ has an equivalent subset of size
bounded by $k$ is NP-complete. This result holds for sets of clauses that imply that all
variables are equivalent and do not entail any single literal.

Proof. The problem of finding a minimum equivalent subgraph is NP-complete. In particular, 
it remains complete even if the graph is strongly connected [Sah74, KRF95]. This is the 
problem of finding the minimum number of edges of the graph that makes the resulting 
graph strongly connected. 

We show a reduction from this problem to that of finding whether there exists an i.e.s. of
a set $\Pi$ of size bounded by $k$. Let $G = (N, E)$ be a graph. The set of variables of $\Pi$ is the
set of nodes $N$. For each edge $(i, j)$ the set $\Pi$ contains the clause $\neg i \vee j$. For any two nodes
$i$ and $j$, the node $j$ is reachable from $i$ if and only if $\Pi \cup \{i\} \models_{UP} j$. In order to complete the
proof, we only have to show that $\Pi \not\models l$ for any literal $l$: if this is true, then reachability is in
one-to-one correspondence with entailment, thus showing that equivalence of graphs implies
equivalence of the corresponding formulae.

Since each clause contains a positive and a negative literal, it is satisfied both by the
model setting all variables to true and by the model setting all variables to false. These
are therefore both models of $\Pi$. However, if one of them satisfies $l$, the other one does not.
Therefore, there is a model of $\Pi$ that does not satisfy $l$, thus showing that $l$ is not implied
by $\Pi$. □
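
The translation used in this proof, and the fact that the resulting set is satisfied by both uniform assignments, can be sketched as follows. Nodes are assumed to be positive integers so that node $i$ doubles as variable $i$; the function names and the example graph are hypothetical.

```python
def graph_to_clauses(edges):
    """Edge (i, j) becomes the clause -i v j."""
    return [(-i, j) for (i, j) in edges]

def satisfies(assignment, clauses):
    """True if the assignment (dict var -> bool) satisfies every clause."""
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)

# Hypothetical strongly connected graph on three nodes.
edges = [(1, 2), (2, 3), (3, 1), (1, 3)]
clauses = graph_to_clauses(edges)

all_true = {v: True for v in (1, 2, 3)}
all_false = {v: False for v in (1, 2, 3)}
```

Since the two assignments disagree on every variable, no literal is satisfied by both, which is the argument used above to show that no literal is entailed.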

Note that the set of clauses we used in the proof makes all variables equivalent. This is
necessary, as the proof of hardness of the minimum equivalent subgraph relies on the
graph being strongly connected.

6 Presence in an I.E.S. 

In this section, we study the problem of checking whether a clause is contained in some
i.e.s.'s of a given 2CNF formula. The cases of consistent acyclic sets have already been
considered, and the problem proved polynomial in these cases. This result can be slightly
extended to the case in which some cycles are present, but none include the clause to check.
We then consider the case of acyclic inconsistent formulae. For consistent cyclic formulae, we
show that the problem is NP-complete regardless of whether some literals are implied by the
formula. The problem has the same complexity if the set is inconsistent.
6.1 Presence in an I.E.S.: Acyclic Inconsistent 2CNF Formulae



In this section, we study the problem of telling whether a clause is in an i.e.s. of an acyclic
inconsistent formula. We prove that this problem is polynomial.

In the particular case when $\Pi$ is inconsistent, an i.e.s. of $\Pi$ is an irredundant inconsistent
subset of $\Pi$. A clause is in some i.e.s. of an inconsistent formula if there is an inconsistent
subset of $\Pi$ that contains this clause, and that becomes consistent if this clause is removed
from it. Formally, $l_1 \vee l_2 \in \Pi$ is in some i.e.s. of $\Pi$ if and only if there exists
$\Pi' \subseteq \Pi \setminus \{l_1 \vee l_2\}$ such that $\Pi' \not\models \bot$ but $\Pi' \cup \{l_1 \vee l_2\} \models \bot$.

If $\Pi \setminus \{l_1 \vee l_2\}$ is consistent, then $\Pi \setminus \{l_1 \vee l_2\} \not\models l_1 \vee l_2$, and the clause is therefore in
all i.e.s.'s of $\Pi$. From now on, we only consider formulae $\Pi$ such that $\Pi \setminus \{l_1 \vee l_2\}$ is
inconsistent. A necessary condition to ensure the presence of the clause $l_1 \vee l_2$ in an i.e.s. of
$\Pi$ in this case is given as follows.

Lemma 17 If $l_1 \vee l_2$ is in an i.e.s. of an inconsistent acyclic 2CNF formula $\Pi$ such that
$\Pi \setminus \{l_1 \vee l_2\}$ is inconsistent, then:

$$\Pi \setminus \{l_1 \vee l_2\} \cup \{l_1\} \models_{UP} \bot$$
$$\Pi \setminus \{l_1 \vee l_2\} \cup \{l_2\} \models_{UP} \bot$$

Proof. We assume that the first equation above is false and prove that $l_1 \vee l_2$ is not in an
i.e.s. of $\Pi$. A similar proof can be used assuming that the second equation is false.

Let $\Pi'$ be a consistent subset of $\Pi \setminus \{l_1 \vee l_2\}$. Since $\Pi \setminus \{l_1 \vee l_2\} \cup \{l_1\} \not\models_{UP} \bot$, we have
that $\Pi' \cup \{l_1\} \not\models_{UP} \bot$. As a result, $\Pi'$ is consistent with $l_1$ and therefore it is also consistent
with $l_1 \vee l_2$. The other way around, if $\Pi''$ is an inconsistent subset of $\Pi$, then $\Pi'' \setminus \{l_1 \vee l_2\}$
is inconsistent as well. As a result, no i.e.s. of $\Pi$ contains $l_1 \vee l_2$. □

The converse of this lemma is not true, as shown by the following formula.

$$\Pi = \{l_1 \vee l_2,\ \neg l_1 \vee l_3,\ \neg l_3 \vee x,\ \neg l_3 \vee \neg x,\ \neg l_2 \vee \neg l_3,\ l_3 \vee y,\ l_3 \vee \neg y\}$$

The graphs of this formula induced by $l_1$ and $l_2$ are as follows; clearly, $\bot$ is reachable
from both $l_1$ and $l_2$ in $\Pi$.

[Figures: graph of $\Pi$ induced by $l_1$; graph of $\Pi$ induced by $l_2$.]
While $\bot$ is reachable from both $l_1$ and $l_2$ in $\Pi$, the clause $l_1 \vee l_2$ is not necessary to produce
inconsistency in any subset of $\Pi$. This is because any subset of $\Pi$ allowing $\bot$ to be reached
from $l_1$ and $l_2$ also allows $\bot$ to be reachable from $l_3$ and $\neg l_3$ without using the clause $l_1 \vee l_2$.

This formula is cyclic: a simple cycle of clauses is $\{l_1 \vee l_2,\ \neg l_2 \vee \neg l_3,\ l_3 \vee \neg l_1\}$. We indeed
prove that the lemma above can be turned into an "if and only if" whenever $\Pi$ is an acyclic
formula.
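
For a formula of this size, the claim can be verified exhaustively. The sketch below assumes a hypothetical integer encoding ($l_1 = 1$, $l_2 = 2$, $l_3 = 3$, $x = 4$, $y = 5$) and enumerates all subsets, so it is only an illustration and not the polynomial test developed in this section.

```python
from itertools import combinations, product

def satisfiable(clauses, n_vars):
    """Brute-force satisfiability over all 2**n_vars assignments."""
    for values in product([False, True], repeat=n_vars):
        if all(any(values[abs(lit) - 1] == (lit > 0) for lit in c) for c in clauses):
            return True
    return False

def in_some_irredundant_inconsistent_subset(clauses, target, n_vars):
    """Is target in an inconsistent subset all of whose clauses are necessary?"""
    others = [c for c in clauses if c != target]
    for size in range(len(others) + 1):
        for rest in combinations(others, size):
            subset = list(rest) + [target]
            if satisfiable(subset, n_vars):
                continue
            # Irredundant: dropping any single clause restores consistency.
            if all(satisfiable([c for c in subset if c != d], n_vars) for d in subset):
                return True
    return False

pi = [(1, 2), (-1, 3), (-3, 4), (-3, -4), (-2, -3), (3, 5), (3, -5)]
clause_l1_l2 = in_some_irredundant_inconsistent_subset(pi, (1, 2), 5)
clause_l3_x = in_some_irredundant_inconsistent_subset(pi, (-3, 4), 5)
```

The clause $l_1 \vee l_2$ is in no irredundant inconsistent subset, while $\neg l_3 \vee x$ belongs to the one made of the four clauses over $l_3$, $x$ and $y$.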

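Lemma 18 The clause $l_1 \vee l_2$ is in some i.e.s. of the acyclic inconsistent 2CNF formula $\Pi$
if the following conditions hold: $\Pi \setminus \{l_1 \vee l_2\} \models \bot$, $\Pi \setminus \{l_1 \vee l_2\} \cup \{l_1\} \models_{UP} \bot$, and
$\Pi \setminus \{l_1 \vee l_2\} \cup \{l_2\} \models_{UP} \bot$.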

Proof. Since $\Pi \setminus \{l_1 \vee l_2\} \cup \{l_1\} \models_{UP} \bot$, there exists a path from $l_1$ to a pair of opposite
literals $l_3$ and $\neg l_3$ in $\Pi \setminus \{l_1 \vee l_2\}$, and the same holds for $l_2$. Let $\Pi'$ be the set composed of
exactly all clauses used in the unit propagation from $l_1$ to $\bot$ and from $l_2$ to $\bot$. We have that
$\Pi' \models \{\neg l_1, \neg l_2\}$. Therefore, $\Pi' \cup \{l_1 \vee l_2\}$ is inconsistent. We prove that $\Pi'$ is consistent, thus
proving that $l_1 \vee l_2$ is in all i.e.s.'s of $\Pi' \cup \{l_1 \vee l_2\}$ and therefore in some i.e.s.'s of $\Pi$.
The paths from $l_1$ and $l_2$ to $\bot$ can be visualized as follows:

[Figure: a path from $l_1$ to $l_3$ and $\neg l_3$, and a path from $l_2$ to $l_4$ and $\neg l_4$.]

The idea is that we can set to false all literals of these two paths besides the final one of
each one, and this would be a model of $\Pi'$. However, these two paths can share variables;
this assignment might not be a model in this case.

These two paths do not contain opposite literals besides $l_3$ and $l_4$. Indeed, if the path
from $l_1$ contains a pair of opposite literals besides $l_3$, $\neg l_3$, then contradiction is reached before
the end of the path. The same holds for the path from $l_2$. If the path from $l_1$ contains $l_5$
and the path from $l_2$ contains $\neg l_5$, then from $l_1$ we can reach $l_5$ and from $l_5$ we can reach
$\neg l_2$; together with the clause $l_1 \vee l_2$, we would have a cycle while $\Pi$ is assumed acyclic.

As a result, these two paths can share literals and clauses, but cannot contain a pair of 
opposite literals. A satisfying truth assignment is obtained as follows: 

1. set all literals to false; 

2. set the final literal of each path to true; 

3. for every node that is set to true, set its successor (if any) to true. 

The third point is necessary because the two paths can share nodes. This assignment sets
two opposite literals to true only if both $l_4$ and $\neg l_4$ are reachable from $\neg l_3$ or both $l_3$ and $\neg l_3$
are reachable from $\neg l_4$. We show that the first situation is impossible; a similar proof holds
for the second situation.
Since $l_4$ and $\neg l_4$ are reachable from $\neg l_3$, we are in the following situation:

[Figure: the paths from $l_1$ and $l_2$, with $l_4$ and $\neg l_4$ reachable from $\neg l_3$.]

This situation is impossible because it implies the existence of a cycle in $\Pi$: from $l_2$ we
can go to $\neg l_3$, and from $\neg l_3$ we can go to $\neg l_1$. The clause $l_1 \vee l_2$ closes the cycle. □

The proof of the lemma above does not hold for cyclic formulae because a literal can
be reachable from $l_1$ while its negation is reachable from $l_2$. As a result, the conditions that
both $l_1$ and $l_2$ are connected to $\bot$ are not sufficient: we also have to check that there is a set
of edges that connect them to $\bot$ while no literal is such that both it and its negation reach
$\bot$. In fact, it can be proved that the problem is NP-hard for cyclic formulae.



6.2 Presence in an I.E.S.: Acyclic Consistent 2CNF Formulae not Implying Literals



We have already shown that an acyclic consistent 2CNF formula not implying a single literal
has a unique i.e.s., which can be determined in polynomial time. Therefore, checking the
presence of a clause in an i.e.s. is easy. In this section we extend this result to the case in
which the formula contains some cycles, but none include the clause under consideration.

The following figure explains the concept: if we box all literals that are equivalent to $l_1$,
and all literals that are equivalent to $l_2$, then $l_1 \rightarrow l_2$ is in some i.e.s. if and only if all paths
from one box to the other one only contain nodes in the boxes.

[Figure: a box of literals equivalent to $l_1$, a box of literals equivalent to $l_2$, and a path between
the boxes through a node $l$ outside them.]
The idea is that, removing all the direct edges from the $l_1$ box to the $l_2$ box, then the edge
$l_1 \rightarrow l_2$ is the only one that makes the boxes connected. This edge is therefore necessary in
a subset $\Pi'$ of $\Pi$ that is equivalent to it. It is therefore in all i.e.s.'s of $\Pi'$, which are some
i.e.s.'s of $\Pi$.

On the other hand, if there is a path that includes a node $l$ not in the boxes, then any
i.e.s. includes a path from the $l_1$ box to $l$ and from $l$ to the $l_2$ box; otherwise, either the node
would not be reachable from the $l_1$ box or the $l_2$ box would not be reachable from it.
The presence of such a path therefore makes $l_1 \rightarrow l_2$ always redundant.

Lemma 19 If $\Pi \not\models (l_1 \equiv l_2)$, then $l_1 \rightarrow l_2$ is in an i.e.s. if and only if all paths from $l_1$ to $l_2$
contain only literals that $\Pi$ makes equivalent either to $l_1$ or to $l_2$.

Proof. Let us assume that all paths from $l_1$ to $l_2$ only contain literals that are equivalent
to either $l_1$ or $l_2$. Since $l_1$ and $l_2$ are not equivalent, any such path can be written as
$(l_1, \ldots, l_3, l_4, \ldots, l_2)$, where $\Pi$ makes the literals $l_1, \ldots, l_3$ equivalent and the literals
$l_4, \ldots, l_2$ equivalent.

Let us now remove the edge $l_3 \rightarrow l_4$. By assumption, $l_4$ can still be reached from $l_3$ by
first going to $l_1$, then to $l_2$, and then to $l_4$. As a result, one path from $l_1$ to $l_2$ has been
removed while maintaining equivalence with $\Pi$.

Iterating this process over all paths from $l_1$ to $l_2$, we end up with a set of clauses whose
only path from $l_1$ to $l_2$ is the single edge $l_1 \rightarrow l_2$. This clause is now irredundant: removing
other redundant clauses, we obtain an i.e.s. with $l_1 \rightarrow l_2$.

Let us now assume the converse: there is a path from $l_1$ to $l_2$ that contains some literals
that are equivalent neither to $l_1$ nor to $l_2$. Such a path can be written as
$(l_1, \ldots, l_3, l_4, \ldots, l_5, l_6, \ldots, l_2)$, where $l_1, \ldots, l_3$ are equivalent to each other, as are
$l_6, \ldots, l_2$, but are not equivalent to any literal in $l_4, \ldots, l_5$. We prove that no i.e.s.
contains the edge $l_1 \rightarrow l_2$.

Let $\Pi'$ be an equivalent subset of $\Pi$. Being equivalent to $\Pi$, it must contain a number of
cycles that make all literals of $l_1, \ldots, l_3$ equivalent to each other and all literals of $l_6, \ldots, l_2$
equivalent to each other. Note that $l_1 \rightarrow l_2$ cannot be in one of such cycles; otherwise, we
would have a cycle joining both $l_1$ and $l_2$, which would prove that they are equivalent.

Since $\Pi \models l_1 \rightarrow l_4$ and $\Pi \models l_4 \rightarrow l_2$, the set $\Pi'$ must contain a path from $l_1$ to $l_4$ and a
path from $l_4$ to $l_2$. If the first path includes $l_1 \rightarrow l_2$, then we would have a path from $l_2$ to
$l_4$. Since $\Pi$ contains a path from $l_4$ to $l_2$, then $l_4$ and $l_2$ would be equivalent. For the same
reason, the path from $l_4$ to $l_2$ does not contain the edge $l_1 \rightarrow l_2$.

As a result, we have proved that $\Pi'$ includes some sets of edges that do not contain
$l_1 \rightarrow l_2$ but allow concluding that $l_1, \ldots, l_3$ are equivalent to each other, that $l_6, \ldots, l_2$ are
equivalent to each other, that $l_1$ implies $l_4$, and that $l_4$ implies $l_2$. The edge $l_1 \rightarrow l_2$ is
therefore redundant. □

This lemma implies that checking whether $l_1 \rightarrow l_2$ is in some i.e.s. is easy if $l_1$ is not
equivalent to $l_2$. Indeed, the set of nodes in the paths from $l_1$ to $l_2$ can be found by intersecting
the set of nodes that are reached from $l_1$ and that of the nodes that $l_2$ is reachable from. If
this intersection contains a literal that is equivalent neither to $l_1$ nor to $l_2$, then the edge
$l_1 \rightarrow l_2$ is redundant.
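
This test can be sketched with two reachability visits per literal. The code assumes that the graph is an adjacency dictionary, that $l_1 \rightarrow l_2$ is one of its edges, that $l_1$ and $l_2$ are not equivalent, and that equivalence coincides with mutual reachability (which is the setting of this section, where no literal is implied). The function names are hypothetical.

```python
def reachable(graph, start):
    """Set of nodes reachable from start (start included)."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in graph.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def reverse(graph):
    """Graph with every edge u -> v turned into v -> u."""
    rev = {}
    for u, succs in graph.items():
        rev.setdefault(u, set())
        for v in succs:
            rev.setdefault(v, set()).add(u)
    return rev

def edge_in_some_ies(graph, l1, l2):
    """Lemma 19 test: every node on a path l1 ~> l2 is equivalent to l1 or l2."""
    rev = reverse(graph)
    from_l1, to_l1 = reachable(graph, l1), reachable(rev, l1)
    from_l2, to_l2 = reachable(graph, l2), reachable(rev, l2)
    on_paths = from_l1 & to_l2                      # nodes on some path l1 ~> l2
    boxes = (from_l1 & to_l1) | (from_l2 & to_l2)   # equivalent to l1 or to l2
    return on_paths <= boxes

# a -> b is redundant here because of the path a -> c -> b.
redundant = edge_in_some_ies({"a": {"b", "c"}, "c": {"b"}}, "a", "b")

# Here the only other path goes through a2, which is equivalent to a.
kept = edge_in_some_ies({"a": {"b", "a2"}, "a2": {"a", "b"}}, "a", "b")
```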
6.3 Presence in an I.E.S.: Cyclic Inconsistent 2CNF Formulae

The problem of deciding the presence of a clause in an i.e.s. of an inconsistent 2CNF formula
is NP-hard if the formula contains cycles. We use this simple preliminary lemma.

Lemma 20 The problem of checking the existence of a simple path from node $x$ to node $y$
in $G$ including a given edge is NP-complete.

Proof. Membership is obvious. Hardness is proved by reduction from the problem path via
a node: given a graph $G$ and three nodes $x$, $y$, and $m$, decide whether there exists a simple
path from $x$ to $y$ including the node $m$.

The reduction is as follows: replace the node $m$ with two nodes $m_1$ and $m_2$ joined by
an edge. For every edge $(n, m)$, add the edge $(n, m_1)$. For every edge $(m, n)$, add the edge
$(m_2, n)$.





By construction, every simple path containing the node $m$ in the original graph contains
an edge that is incoming to $m$ and an edge that is outgoing from $m$. This is possible if and
only if we have a simple path in the new graph including the edge $(m_1, m_2)$. □
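
The node-splitting step of this reduction can be sketched as follows. Graphs are assumed to be sets of directed edges; representing the two copies of $m$ as the pairs `(m, 1)` and `(m, 2)` is an arbitrary choice made for the example, and `split_node` is a hypothetical name.

```python
def split_node(edges, m):
    """Replace m by m1 -> m2: edges entering m go to m1, edges leaving m start at m2."""
    m1, m2 = (m, 1), (m, 2)
    new_edges = {(m1, m2)}
    for u, v in edges:
        src = m2 if u == m else u
        dst = m1 if v == m else v
        new_edges.add((src, dst))
    return new_edges, (m1, m2)

new_edges, forced_edge = split_node({("x", "m"), ("m", "y")}, "m")
```

The returned pair `forced_edge` is the edge that a simple path from $x$ to $y$ is required to include in the new instance.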

We can now prove that the problem of presence of a clause in an I.E.S. is NP-complete 
if the formula is inconsistent and cyclic. 

Theorem 8 Deciding whether a cyclic inconsistent 2CNF formula has an i.e.s. that
contains a given clause is NP-complete.

Proof. Membership is obvious by guessing a subset of the formula and then checking whether
it is an i.e.s. and contains the given clause. Hardness is proved by reduction from the
problem of checking whether a graph contains a simple path from a node to another one
that includes a given edge.

Given the instance of the original problem $(G, x, y, (a, b))$, in which we want to determine
whether there exists a simple path from $x$ to $y$ including the edge $(a, b)$, we build a formula
$\Pi$ as follows: for each node of the graph, we have a literal; for each edge $(n, m)$ of the graph,
we have the clause $\neg n \vee m$; finally, we add the following clauses, where $z$ and $w$ are new
variables:

$$\{y \rightarrow z,\ y \rightarrow \neg z,\ \neg x \rightarrow w,\ \neg x \rightarrow \neg w\}$$
Graphically, $\Pi$ is as follows.




The part of the formula corresponding to the left part of the graph is satisfiable by setting
all variables it contains to false. Indeed, this assignment makes all literals of the graph but $\neg z$
false; since all clauses contain at least one negated literal, they are all satisfied. By setting
$x$ to true, the part of the formula corresponding to the graph on the right is satisfied.

Since both parts of $\Pi$ are satisfiable, and $x$ is the only variable that is common to them,
we have that $x$ is the variable such that $\Pi \cup \{x\} \models_{UP} \bot$ and $\Pi \cup \{\neg x\} \models_{UP} \bot$. Since $z$ is the
only variable that occurs both positively and negatively in the first part of the formula, we
have that $y$ must be reachable from $x$ in any i.e.s. of $\Pi$. As a result, there exists an acyclic
path including the edge $(a, b)$ if and only if the clause $\neg a \vee b$ is in some i.e.s. of $\Pi$. □

6.4 Presence in an I.E.S.: Cyclic Consistent 2CNF Formulae Implying Literals

The i.e.s.'s of a consistent acyclic set can be compactly expressed as a number of independent
choices. This makes the problem of checking the presence of a clause in an i.e.s. easy. In
this section, we extend this polynomiality result to the case in which a literal of the clause
to check is implied by the formula but the clause is not contained in any cycle of clauses.
We also prove that the problem is instead NP-complete if the clause is contained in a cycle.

Theorem 9 The problem of deciding whether $\neg l_1 \vee l_2$ is in an i.e.s. of a consistent 2CNF
formula $\Pi$ is polynomial if $\Pi \models \neg l_1$ and $l_1$ is not reachable from $l_2$.

Proof. The proof is similar to that of Lemma 15 with $CC_\Pi(l_1)$ in place of $l_1$ alone. The set
$CC_\Pi(l_1)$ of literals that are in a cycle with $l_1$ can be determined in polynomial time. The
clause $\neg l_1 \vee l_2$ is part of an i.e.s. of $\Pi$ if and only if $\Pi \cup \{l_2\} \models_{UP} \bot$ or $\Pi \cup \{l_2\} \models_{UP} \neg l_3$,
where $l_3 \in CC_\Pi(l_1)$.

This algorithm is correct because there is no path from $l_2$ to $l_1$. Therefore, there is no
path from $l_2$ to any node of $CC_\Pi(l_1)$. As a result, the derivations from $l_2$ to $\bot$ or to $\neg l_3$ do
not contain any literal in $CC_\Pi(l_1)$.

Let $\Pi'$ be an i.e.s. of $\Pi$. The derivations from $l_2$ to $\bot$ or to $\neg l_3$ still hold in $\Pi'$ and do not
involve literals in $CC_\Pi(l_1)$. If $\bot$ is derivable from $l_2$, we can obtain an i.e.s. by removing
all clauses $\neg l \vee l'$ with $l \in CC_\Pi(l_1)$ and adding $\neg l_1 \vee l_2$ and a minimal number of clauses
to make $l_2$ reachable from any literal in $CC_\Pi(l_1)$. This addition makes the negation of all
literals of $CC_\Pi(l_1)$ entailed. This is therefore an i.e.s. because all clauses that have been
removed contain the negation of a literal in $CC_\Pi(l_1)$.

A similar proof can be given for the case $\Pi \cup \{l_2\} \models_{UP} \neg l_3$. □

This theorem is about a clause $\neg l_1 \vee l_2$ that connects a node of $CC_\Pi(l_1)$ with a node
outside it. Rewriting the clause as $l_1 \rightarrow l_2$ makes the idea more evident: the clause can
be viewed as an edge that starts from $CC_\Pi(l_1)$, which is a set of literals all connected to
each other, but ends outside $CC_\Pi(l_1)$. We now consider the case of a clause $\neg l_1 \vee l_2$ that is
"internal" to $CC_\Pi(l_1)$, that is, both $l_1$ and $l_2$ are in $CC_\Pi(l_1)$. This condition is equivalent
to the existence of a cycle including both $l_1$ and $l_2$, which can be checked in polynomial time.
The problem of presence of this clause in an i.e.s. is NP-complete.
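The polynomial check mentioned above (whether two literals lie on a common cycle) amounts to
mutual reachability in the graph of the formula. The following is a minimal sketch; the encoding of
the graph as a list of directed edges and the function name are assumptions made for illustration.

```python
def on_common_cycle(edges, a, b):
    """True if a reaches b and b reaches a in the directed graph given by edges."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)

    def reach(source):
        seen, stack = {source}, [source]
        while stack:
            for nxt in adjacency.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    return b in reach(a) and a in reach(b)
```

Two graph searches suffice, so the check is linear in the size of the graph.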

Theorem 10 The problem of checking the presence of $\neg l_1 \vee l_2$ in some i.e.s. of a consistent
2CNF formula $\Pi$ such that $\Pi \models \neg l_1$ is NP-complete if $\Pi \cup \{l_2\} \models_{UP} l_1$.

Proof. The proof is by reduction from the problem of deciding the existence of two vertex-disjoint
paths in a directed graph, which is NP-complete [EIS76, FHW80] (the corresponding
problem for undirected graphs is polynomial [RS04]). This is the problem of establishing
whether a graph $G$ contains a path from node $s_1$ to node $t_1$ and a path from node $s_2$ to
node $t_2$ such that these two paths do not share nodes. We can assume that, from each node of the
graph, either $t_1$ or $t_2$ is reachable. The formula we consider is the one corresponding to the
following graph.




This formula implies the negation of all positive literals besides $x$. Indeed, the pair of
nodes $x$ and $\neg x$ can be reached from any other node of the graph. The same property must
therefore be true for all i.e.s.'s of this formula. Since $x$ and $\neg x$ form the only pair of opposite
literals in the graph, the property holds only if both $l_3$ and $l_4$ are reachable from any other
node besides $x$ and $\neg x$ in the graph corresponding to the i.e.s.
If $G$ has a pair of vertex-disjoint paths from $s_1$ to $t_1$ and from $s_2$ to $t_2$, respectively, an
i.e.s. containing the clause corresponding to the edge $l_1 \rightarrow l_2$ is the subformula containing
the clauses corresponding to the following edges:




Additionally, the i.e.s. contains a number of edges making either $t_1$ or $t_2$ reachable from
every node of $G$ not already having this property: this can be done by adding an edge $s \rightarrow t$
if either $t_1$ or $t_2$ is reachable from $t$ but not from $s$. The addition of these edges does not
make $t_2$ reachable from $s_1$ or $t_1$ from $s_2$ because an edge $s \rightarrow t$ is only added if no other edge
outgoing from $s$ is already in the i.e.s.

In this subgraph, both $l_3$ and $l_4$ are reachable from $l_1$ and $l_1$ is reachable from any other
node of the graph. Since $l_4$ is only reachable from $l_1$ via the edge $l_1 \rightarrow l_2$, the corresponding
clause is irredundant. Since $\neg l_1 \vee l_2$ is irredundant in this subformula, every i.e.s. of this
subformula contains this clause by Property 1; since these i.e.s.'s are also i.e.s.'s of the
original formula, we have that $\neg l_1 \vee l_2$ is contained in some i.e.s. of the original formula.

Let us now assume that $G$ contains no vertex-disjoint paths from $s_1$ to $t_1$ and from $s_2$ to
$t_2$, respectively. We prove that $\neg l_1 \vee l_2$ is redundant in every i.e.s. of the formula. Since $x$ is
the only variable that occurs both direct and negated in the graph, in every i.e.s. of $\Pi$ both
$l_3$ and $l_4$ are reachable from any other node of the graph besides $x$ and $\neg x$. The following
are the edges that are necessarily contained in any i.e.s. of $\Pi$ because either they are the
only outgoing edges of a node or they are the only edges that are incoming to $l_3$ or $l_4$.
Since $s_2$ is connected to nodes outside $G$, it is either connected to $t_1$ or to $t_2$. If $t_1$
is reachable from $s_2$, then both $l_3$ and $l_4$ are reachable from $l_1$, and the edge $l_1 \rightarrow l_2$ is
redundant. We can therefore consider only the case in which $s_2$ is connected to $t_2$ but not
to $t_1$.




In this graph, $t_1$ is not reachable from $s_2$ using only edges inside $G$. The other nodes of
the graph can therefore be connected to $l_4$ only via a path from $s_1$ to $t_1$. By assumption,
however, every path from $s_1$ to $t_1$ shares a node with the path from $s_2$ to $t_2$, which makes $t_1$
reachable from $s_2$, contradicting the assumption. □
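The two vertex-disjoint paths problem used in this reduction can be made concrete with a
brute-force Python sketch. It is exponential in the worst case, in line with the NP-completeness of
the problem on directed graphs, and is meant only to fix the definition; the edge-list encoding and
the function names are assumptions made for illustration.

```python
def simple_paths(adjacency, source, target, blocked=frozenset()):
    """Yield every simple path from source to target that avoids blocked nodes."""
    def extend(path):
        node = path[-1]
        if node == target:
            yield tuple(path)
            return
        for nxt in adjacency.get(node, ()):
            if nxt not in path and nxt not in blocked:
                yield from extend(path + [nxt])
    if source not in blocked:
        yield from extend([source])

def two_disjoint_paths(edges, s1, t1, s2, t2):
    """True if there are paths s1..t1 and s2..t2 sharing no node."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
    for first in simple_paths(adjacency, s1, t1):
        # Look for a second path that avoids every node of the first one.
        for _ in simple_paths(adjacency, s2, t2, frozenset(first)):
            return True
    return False
```

In a graph where both paths are forced through the same intermediate node, the function returns
`False`, which is the situation exploited in the second half of the proof.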



6.5 Presence in an I.E.S.: Cyclic Consistent 2CNF Formulae not 
Implying Literals 

The problem of presence of a clause in an i.e.s. of a 2CNF formula is polynomial if the
formula is acyclic and consistent. We now show that the same problem is NP-complete if
the formula is still consistent but cyclic. First of all, if a formula is consistent and does not
entail literals, then every clause $l_1 \vee l_2$ can be made irredundant by the following lemma.

Lemma 21 If $\Pi$ does not entail $\neg l_1$ nor $\neg l_2$ and contains the clauses $l_1 \vee w$ and $\neg w \vee l_2$,
where $w$ is a variable appearing only in these two clauses, then $\Pi$ is equivalent to
$\Pi \cup \{l_1 \vee l_2\}$, and the two clauses $l_1 \vee w$ and $\neg w \vee l_2$ are irredundant in $\Pi$.

Proof. Equivalence is due to the fact that $l_1 \vee l_2$ is obtained from $l_1 \vee w$ and $\neg w \vee l_2$ by
resolution. Irredundancy is due to the fact that $\Pi \cup \{\neg l_1\} \models_{UP} w$ in $\Pi$ because $\Pi$ is
consistent and does not entail literals. Since $l_1 \vee w$ is the only clause containing $w$ positively,
this clause is necessary. For the same reason, $\neg w \vee l_2$ is necessary. □
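The statement of the lemma can be checked by brute force on a small instance. The sketch below is
only an illustration under assumptions of our own: literals are nonzero integers, the example
formula is hypothetical (variables 1 and 2 play the roles of $l_1$ and $l_2$, variable 4 that of
$w$), and redundancy is tested by comparing sets of models.

```python
from itertools import product

def satisfies(assignment, clause):
    """True if some literal of the clause is true in the assignment."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)

def models(clauses, variables):
    """Set of truth-value tuples (ordered as variables) satisfying all clauses."""
    found = set()
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(satisfies(assignment, clause) for clause in clauses):
            found.add(values)
    return found

def is_redundant(clauses, clause, variables):
    """True if removing the clause leaves the set of models unchanged."""
    rest = [c for c in clauses if c != clause]
    return models(rest, variables) == models(clauses, variables)

# Hypothetical example: l1 = 1, l2 = 2, w = 4, plus one unrelated clause.
VARIABLES = [1, 2, 3, 4]
PI = [(1, 4), (-4, 2), (-3, 1)]
```

On this instance, adding the resolvent `(1, 2)` does not change the models of `PI`, while neither
`(1, 4)` nor `(-4, 2)` can be removed without changing them.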

If a clause is necessary in a formula, it is contained in all its i.e.s.'s. Whenever we have
a formula $\Pi$ that is consistent and does not entail literals, and we want a clause $l_1 \vee l_2$ to
be contained in all its i.e.s.'s, we can then replace it with $l_1 \vee w$ and $\neg w \vee l_2$. The original
clause is still implied by these two new ones by resolution, but the new clauses are necessary.
In order to keep proofs simple, we use the following notation.

Notation: $l_1 \overset{\circ}{\vee} l_2$ means $\{l_1 \vee w, \neg w \vee l_2\}$, where $w$ is a new variable.

We use this notation because the clauses $\{l_1 \vee w, \neg w \vee l_2\}$ actually represent the single
clause $l_1 \vee l_2$ since $w$ is not used anywhere else. The circle over the symbol $\vee$ reminds us that
these two clauses cannot be removed from a formula without changing its semantics. We can
now prove that deciding the presence of a clause in an i.e.s. of a formula is NP-complete.

Theorem 11 Deciding whether a clause is in an i.e.s. of a cyclic and consistent 2CNF
formula $\Pi$ is NP-hard even if no single literal is implied by $\Pi$ and $\Pi$ makes all literals
equivalent.

Proof. We show a proof of hardness from 3SAT. The set of clauses generated by this particular
reduction is such that all clauses are of the form $\neg l \vee l'$. Such formulae can be represented
by their induced graphs. In this proof, we use $l \rightarrow l'$ to denote $\neg l \vee l'$ and
$l \overset{\circ}{\rightarrow} l'$ to denote $\neg l \overset{\circ}{\vee} l'$. We also use $l \Rightarrow l'$ to denote the
reachability of $l'$ from $l$ and $l \Leftrightarrow l'$ to denote that $l$ and $l'$ can be reached from each other.

Given a set of clauses $\Gamma = \{\gamma_1, \ldots, \gamma_m\}$, we generate a formula $\Pi$ and one of its clauses
$l_1 \rightarrow l_2$ in such a way that $l_1 \rightarrow l_2$ is in some i.e.s. of $\Pi$ if and only if $\Gamma$ is satisfiable.
The graph corresponding to $\Pi$ is strongly connected. In particular, truth assignments on $\Gamma$
correspond to subsets of $\Pi$ in which $l_2 \Rightarrow n \Rightarrow l_1$ for every node $n$. Therefore, the graph is
strongly connected if and only if $l_1 \Rightarrow l_2$. The truth assignment satisfies $\Gamma$ if and only if
$l_1 \Rightarrow l_2$ does not hold in the corresponding subformula, thus making the clause $l_1 \rightarrow l_2$ necessary.

From now on, we consider $\Pi$ as a graph. The graph corresponding to $\Gamma$ is as follows: for
each variable $x_i$, we have three nodes $x_i$, $x_i^+$, and $x_i^-$. For each clause $\gamma_j$ we have a node $c_j$.
The edges of $\Pi$ are the following ones:

1. Nodes forming a strongly connected component containing $l_1$:

(a) $l_1 \overset{\circ}{\rightarrow} x_i$;

(b) $x_i \rightarrow x_i^+$ and $x_i \rightarrow x_i^-$;

(c) $x_i^+ \overset{\circ}{\rightarrow} l_1$ and $x_i^- \overset{\circ}{\rightarrow} l_1$;

2. Nodes forming a strongly connected component containing $l_2$:

(a) $l_2 \rightarrow x_i^+$ and $l_2 \rightarrow x_i^-$;

(b) $x_i^+ \rightarrow c_j$ if $x_i \in \gamma_j$;

(c) $x_i^- \rightarrow c_j$ if $\neg x_i \in \gamma_j$;

(d) $c_j \overset{\circ}{\rightarrow} l_2$.
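The construction can be sketched in Python as follows. Several choices here are assumptions made
for illustration rather than part of the paper: nodes are encoded as strings, clauses of the 3SAT
instance as lists of nonzero integers, the circled (non-removable) edges are taken to be those of
items 1(a), 1(c) and 2(d), and the edge $l_1 \rightarrow l_2$, which the text names as a clause of the
formula, is added explicitly. Function names are hypothetical.

```python
def reduction_edges(num_vars, clauses):
    """Return (plain, circled) edge sets of the graph built from a CNF instance."""
    plain, circled = set(), set()
    plain.add(("l1", "l2"))                        # the clause under test (assumed present)
    for i in range(1, num_vars + 1):
        x, xp, xm = f"x{i}", f"x{i}+", f"x{i}-"
        circled.add(("l1", x))                     # 1(a)
        plain.update({(x, xp), (x, xm)})           # 1(b)
        circled.update({(xp, "l1"), (xm, "l1")})   # 1(c)
        plain.update({("l2", xp), ("l2", xm)})     # 2(a)
    for j, clause in enumerate(clauses, start=1):
        c = f"c{j}"
        for lit in clause:
            source = f"x{abs(lit)}+" if lit > 0 else f"x{abs(lit)}-"
            plain.add((source, c))                 # 2(b) and 2(c)
        circled.add((c, "l2"))                     # 2(d)
    return plain, circled

def strongly_connected(edges):
    """True if every node reaches every other node (edges must be non-empty)."""
    nodes = {n for edge in edges for n in edge}

    def reach(start, pairs):
        adjacency = {}
        for u, v in pairs:
            adjacency.setdefault(u, set()).add(v)
        seen, stack = {start}, [start]
        while stack:
            for v in adjacency.get(stack.pop(), ()):
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen

    root = next(iter(nodes))
    reversed_edges = {(v, u) for u, v in edges}
    return reach(root, edges) == nodes and reach(root, reversed_edges) == nodes
```

For the single clause $x_1 \vee \neg x_2$, `reduction_edges(2, [[1, -2]])` yields a graph on which
`strongly_connected` can be evaluated by passing the union of the two edge sets.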

This graph is strongly connected. This must also be true for every graph representing an
i.e.s. of $\Pi$. Graphically, the clause $\gamma_1 = x_1 \vee \neg x_2$ is represented as in Figure 1. The nodes
$l_1$ and $l_2$ have been omitted to keep the figure simple: if a node is missing at the left of an
arrow, it is $l_1$; if it is at the right, it is $l_2$. Arrows marked with a circle cannot be removed
while looking for an i.e.s. of this formula.




Figure 1: The subgraph corresponding to $\gamma_1 = x_1 \vee \neg x_2$.

Every truth assignment on $\Gamma$ can be associated to a set of edges to remove from $\Pi$ with the
aim of obtaining an i.e.s. In particular, if $\Gamma$ is satisfiable we can remove some of the edges
in such a way that $l_2 \Rightarrow n \Rightarrow l_1$ still holds for every node $n$, but $l_1 \Rightarrow l_2$ does not. This way
$l_1 \rightarrow l_2$ is necessary to make this graph strongly connected and therefore equivalent to the
original one.

Let us assume that $\Gamma$ is satisfiable, and let $M$ be one of its models. We build a subset of
$\Pi$ by removing the following edges:

1. if $x_i$ is positive in $M$, remove the edge $x_i \rightarrow x_i^+$ and all edges from $x_i^-$ to a node $c_j$;

2. if $x_i$ is negative in $M$, do the other way around.

The following figure shows the edges that are removed from the formula above if $M$
assigns true to both $x_1$ and $x_2$.



[Figure: the graph of Figure 1 after removing the edges selected by a model $M$ that assigns true
to both $x_1$ and $x_2$.]



In this subgraph, $l_2 \Rightarrow n \Rightarrow l_1$ holds for all nodes. For nodes $x_i^+$ and $x_i^-$ there are edges
from $l_2$ to them and from them to $l_1$. The cycle $l_1 \Rightarrow x_i \Rightarrow x_i^+ \Rightarrow l_1$ (if $x_i$ is negative in $M$),
or the cycle with $x_i^-$ replacing $x_i^+$ (if $x_i$ is positive in $M$), puts every node $x_i$ in the same
strongly connected component as $l_1$. A node $c_j$ is in the same strongly connected component as
$l_2$ thanks to a cycle $l_2 \rightarrow x_i^+ \rightarrow c_j \rightarrow l_2$ where $x_i$ is positive both in $M$ and in $\gamma_j$, or the
similar cycle if the literal of $\gamma_j$ that is true in $M$ is a negative one.

Since $l_2 \Rightarrow n \Rightarrow l_1$ holds for all nodes, the addition of the edge $l_1 \rightarrow l_2$ makes all nodes
reachable from each other, making this formula equivalent to the original one. On the other
hand, $l_1 \Rightarrow l_2$ is not true in this graph: indeed, all paths $l_1 \rightarrow x_i \rightarrow x_i^+ \rightarrow c_j \rightarrow l_2$ have been
broken because either the edge $x_i \rightarrow x_i^+$ or the edge $x_i^+ \rightarrow c_j$ has been removed, and the
same holds for the similar path containing $x_i^-$ in place of $x_i^+$.
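The argument of this direction of the proof can be replayed on a small instance. The sketch below
is an illustration under our own assumptions: nodes are strings, clauses are lists of nonzero
integers, the graph is built without the edge $l_1 \rightarrow l_2$ (so that reachability is evaluated
with that edge removed), and all function names are hypothetical.

```python
def edges_without_l1_l2(num_vars, clauses):
    """Edges of the reduction graph, leaving out the edge l1 -> l2."""
    edges = set()
    for i in range(1, num_vars + 1):
        x, xp, xm = f"x{i}", f"x{i}+", f"x{i}-"
        edges |= {("l1", x), (x, xp), (x, xm), (xp, "l1"), (xm, "l1"),
                  ("l2", xp), ("l2", xm)}
    for j, clause in enumerate(clauses, start=1):
        edges.add((f"c{j}", "l2"))
        for lit in clause:
            edges.add((f"x{abs(lit)}{'+' if lit > 0 else '-'}", f"c{j}"))
    return edges

def prune(edges, model):
    """Remove the edges selected by a model (dict: variable index -> bool)."""
    kept = set(edges)
    for i, value in model.items():
        x = f"x{i}"
        cut = "+" if value else "-"      # the edge x_i -> x_i^cut is removed
        other = "-" if value else "+"    # edges x_i^other -> c_j are removed
        kept.discard((x, x + cut))
        kept = {(u, v) for (u, v) in kept
                if not (u == x + other and v.startswith("c"))}
    return kept

def reach(edges, start):
    """Nodes reachable from start (start included)."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
    seen, stack = {start}, [start]
    while stack:
        for v in adjacency.get(stack.pop(), ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def witnesses_necessity(edges, kept):
    """True if l2 => n => l1 for every node n, but l1 does not reach l2."""
    nodes = {n for edge in edges for n in edge}
    backwards = {(v, u) for u, v in kept}
    return (reach(kept, "l2") == nodes and reach(backwards, "l1") == nodes
            and "l2" not in reach(kept, "l1"))
```

With the hypothetical instance $(x_1 \vee \neg x_2) \wedge (\neg x_1 \vee x_2)$ and the model assigning
true to both variables, `witnesses_necessity` holds on the pruned graph; with an assignment that
falsifies a clause, the corresponding node $c_j$ is no longer reachable from $l_2$.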

Let us now assume that $\Pi$ has an i.e.s. $\Pi'$ containing the clause $l_1 \rightarrow l_2$, and show a
truth assignment satisfying all clauses of $\Gamma$. Since the graph of $\Pi$ is strongly connected, the
same holds for $\Pi'$. Since removing $l_1 \rightarrow l_2$ makes the graph of $\Pi'$ not strongly connected, we
have that:

1. $l_2$ is not reachable from $l_1$ in the graph corresponding to $\Pi' \setminus \{l_1 \rightarrow l_2\}$;

2. every node is reachable from either $l_1$ or $l_2$ and reaches either $l_1$ or $l_2$ in the graph
corresponding to $\Pi' \setminus \{l_1 \rightarrow l_2\}$: otherwise, the addition of $l_1 \rightarrow l_2$ would not make the
graph strongly connected.

Since $x_i$ is reachable from $l_1$, it must reach either $l_1$ or $l_2$; however, it cannot reach $l_2$ as
otherwise we would have $l_1 \Rightarrow l_2$. As a result, for every $x_i$, either $x_i \rightarrow x_i^+$ or $x_i \rightarrow x_i^-$ is in
$\Pi'$. In other words, it cannot be that both edges are not in $\Pi'$ for the same $x_i$. As a result,
the following is a partial model (a consistent set of literals).

$M = \{x_i \mid x_i \rightarrow x_i^- \in \Pi'\} \cup \{\neg x_i \mid x_i \rightarrow x_i^+ \in \Pi'\}$

We show that all clauses of $\Gamma$ are satisfied by $M$. Let $\gamma_j$ be a clause. Since the edge
$c_j \rightarrow l_2$ is not removable, $l_1 \Rightarrow c_j$ cannot hold, as otherwise $l_2$ would be reachable from $l_1$.
As a result, $l_2 \Rightarrow c_j$ must hold. This implies that there exists an index $i$ such that either
$x_i \in \gamma_j$ and $x_i^+ \rightarrow c_j \in \Pi'$ or $\neg x_i \in \gamma_j$ and $x_i^- \rightarrow c_j \in \Pi'$. In the first case,
$x_i \rightarrow x_i^+ \notin \Pi'$ as this edge would create a path from $l_1$ to $l_2$. This however implies that $x_i$
is set to true by $M$. Since $x_i \in \gamma_j$, the clause $\gamma_j$ is satisfied by $M$. The case $\neg x_i \in \gamma_j$
and $x_i^- \rightarrow c_j \in \Pi'$ is similar. □



7 Horn Formulae 

When considering the complexity of redundancy for Horn formulae, the following problems
are clearly polynomial.

1. checking redundancy;

2. checking whether a set is an i.e.s.;

3. checking whether a clause is in all i.e.s.'s;

4. uniqueness.

The problem of size is easily proved to be NP-complete: indeed, the corresponding proof
for the case of consistent acyclic 2CNF formulae not implying single literals uses clauses
corresponding to the edges of a graph whose nodes are all positive literals. These clauses are
therefore all in the form $\neg x \vee y$, that is, they are binary Horn clauses.

We show that the problem of size is also NP-complete for inconsistent Horn formulae.

Theorem 12 Deciding whether a Horn formula II has an equivalent subset of size k is 
NP-complete. 

Proof. Membership is obvious. Hardness is proved by reduction from vertex cover. Let $G$
be a graph. We build the following set of Horn clauses: for each node $i$ we have a unit clause
$x_i$; for each edge $z = (i, j)$ we have two clauses $x_i \rightarrow a_z$ and $x_j \rightarrow a_z$; finally, we have the
clause $\neg a_1 \vee \cdots \vee \neg a_m$.

Inconsistent subsets of this formula are composed of $\neg a_1 \vee \cdots \vee \neg a_m$, plus a pair $x_i$ and
$x_i \rightarrow a_z$ for each edge $z$. This means that we have exactly one clause $x_i \rightarrow a_z$ for each edge of
the graph. Moreover, for each edge we have to include a unit clause corresponding to one of
its incident nodes. Therefore, minimal inconsistent subsets are in one-to-one correspondence
with vertex covers of the original graph. Namely, $G$ has a vertex cover of $k$ nodes if and
only if the formula has an inconsistent subset of $m + 1 + k$ clauses, where $m$ is the number
of edges of $G$. □
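The correspondence between vertex covers and inconsistent subsets can be checked by brute force on
a small graph. The following Python sketch is only an illustration: the representation of a Horn
clause as a pair (body, head), with `None` as head for the negative clause, and all function names
are assumptions of ours; inconsistency is tested by forward chaining.

```python
from itertools import combinations

def horn_clauses(num_nodes, edges):
    """Horn clauses of the reduction; a clause is (body, head), head None for the goal."""
    clauses = [(frozenset(), f"x{i}") for i in range(1, num_nodes + 1)]
    for z, (i, j) in enumerate(edges, start=1):
        clauses.append((frozenset({f"x{i}"}), f"a{z}"))
        clauses.append((frozenset({f"x{j}"}), f"a{z}"))
    goal_body = frozenset(f"a{z}" for z in range(1, len(edges) + 1))
    clauses.append((goal_body, None))
    return clauses

def inconsistent(clauses):
    """Forward chaining: True if the negative clause is violated by the derived atoms."""
    true_atoms, changed = set(), True
    while changed:
        changed = False
        for body, head in clauses:
            if body <= true_atoms:
                if head is None:
                    return True
                if head not in true_atoms:
                    true_atoms.add(head)
                    changed = True
    return False

def smallest_inconsistent_subset(clauses):
    """Size of a smallest inconsistent subset, or None if the set is consistent."""
    for size in range(1, len(clauses) + 1):
        for subset in combinations(clauses, size):
            if inconsistent(subset):
                return size
    return None

def min_vertex_cover(num_nodes, edges):
    """Size of a smallest vertex cover, by exhaustive search."""
    for k in range(num_nodes + 1):
        for cover in combinations(range(1, num_nodes + 1), k):
            if all(i in cover or j in cover for i, j in edges):
                return k
```

On a triangle, the smallest vertex cover has two nodes and the smallest inconsistent subset has
$m + 1 + k = 3 + 1 + 2$ clauses, in agreement with the statement above.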
Notations



$C_\Pi(l) = \{l' \mid \neg l \vee l' \in \Pi\}$

$D_\Pi(l) = \{\gamma \in \Pi \mid \gamma = \neg l \vee l'\}$

$R_\Pi(l) = \Pi \setminus D_\Pi(l)$

$M_\Pi(l) = \{l' \in C_\Pi(l) \mid \not\exists l'' \in C_\Pi(l) \text{ such that } R_\Pi(l) \cup \{l''\} \models_{UP} l'\}$

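$S_\Pi(l) = \{l' \in C_\Pi(l) \mid R_\Pi(l) \cup \{l'\} \models_{UP} \bot\}$

$P_\Pi(l) = \{(l_1, l_2) \mid l_1, l_2 \in C_\Pi(l) \text{ and } R_\Pi(l) \cup \{l_1\} \models_{UP} l_2\} \setminus S_\Pi(l)$

$CC_\Pi(l) = \{l' \mid l \text{ and } l' \text{ are in a cycle}\}$

$JC_\Pi(l) = \{l' \mid \neg l'' \vee l' \in \Pi \text{ and } l'' \in CC_\Pi(l)\} \setminus CC_\Pi(l)$

$LC_\Pi(l) = \{l' \in JC_\Pi(l) \mid \Pi \cup \{l\} \models_{UP} l'\}$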